Abstract

We study a natural information dissemination problem for multiple mobile agents in a bounded
Euclidean space. Agents are placed uniformly at random in the d-dimensional space $\{-n, \ldots, n\}^d$ at
time zero, and one of the agents holds a piece of information to be disseminated. All the agents then
perform independent random walks over the space, and the information is transmitted from one agent to
another if the two agents are sufficiently close. We wish to bound the total time before all agents receive
the information (with high probability). Our work extends the work of Pettarin et al. [13], which solved the
problem for $d \le 2$. We present tight bounds up to polylogarithmic factors for the case $d = 3$. (While our
results extend to higher dimensions, for space and readability considerations we provide only the case
$d = 3$ here.) Our results show that the behavior when $d \ge 3$ is qualitatively different from the case $d \le 2$.
In particular, as the ratio between the volume of the space and the number of agents varies, we show an
interesting phase transition for three dimensions that does not occur in one or two dimensions.
1 Introduction



We study the following information diffusion problem: let $a_1, a_2, \ldots, a_m$ be m agents initially starting
at locations chosen uniformly at random in $V^d = \{-n, -(n-1), \ldots, n\}^d$ and performing independent
random walks over this space. One of the agents initially has a message, and the message is transmitted
from one agent to another when they are sufficiently close. We are interested in the time needed to flood
the message, that is, the time when all agents obtain the message. In other settings, this problem has been
described as a virus diffusion problem, where the message is replaced by a virus that spreads according to
proximity. We use information diffusion and virus spreading interchangeably, depending on which is more
useful in context. This is a natural model that has been extensively studied. For example, Alves et al. and
Kesten et al. coined the name "frog model" for this problem in the virus setting, and studied the shape
formed by the infected contour in the limiting case [1, 2, 10]. In the flooding time setting, early works used
a heuristic approximation based on simplifying assumptions to characterize the dynamics of the spread of 
the message lf3l [TTl[T8l . More recent works provide fully rigorous treatments under this or similar random 
walk models (3 HO ED M (H- 

The most relevant recent works are those of Pettarin et al. [13] and Peres et al. [12, 16]. The work of
Pettarin et al. examines the same model as ours, but their analysis is only for one- and two-dimensional
grids. The work of Sinclair and Stauffer [16] considers a similar model they call mobile geometric graphs,
and their work extends to higher dimensions. However, their focus and model both differ substantially
from ours. For example, they assume a Poisson point process of constant intensity, leading to a number of
agents linear in the size of the space. In contrast, our results allow a sublinear number of agents, a scenario
not directly relevant to their model. Also, they focus on structural aspects of the mobile graphs, such as
percolation, while we are primarily interested in the diffusion time. There are additional smaller differences,
but the main point is that for our problem we require and introduce new techniques and analysis.

Our paper presents matching lower bounds and upper bounds (up to polylogarithmic factors) for the 
flooding problems in d-dimensional space for an arbitrary constant d. For ease of exposition, in this paper 
we focus on the specific case where d = 3, which provides the main ideas. Two- and three-dimensional
random walks have quite different behaviors: specifically, two-dimensional random walks are recurrent
while three-dimensional random walks are transient, so it is not surprising that previous results for two
dimensions fail to generalize immediately to three-dimensional space. Our technical contributions include 
new techniques and tools for tackling the flooding problem by building sharper approximations on the effect 
of agent interactions. The techniques developed in this paper are also robust enough that our results can be 
extended to variations of the model, such as allowing probabilistic infection rules, replacing discrete time 
random walks by continuous time Brownian motions, or allowing the agents to make jumps 0. These 
extensions will be reported in future work. 

Although the information diffusion problem in three or more dimensions appears less practically
relevant than the two-dimensional case, we expect the model will still prove valuable. For instance, particles
in a high dimensional space may provide a latent-space representation of the agents in a dynamic social 
network (HO, so understanding information diffusion process may be helpful for designing appropriate 
latent space models in the future. Also, the problem is mathematically interesting in its own right. 

1.1 Our models and results 

We follow the model developed in [13]. Let $V^d = \{-n, -(n-1), \ldots, 0, \ldots, n-1, n\}^d$ be a
d-dimensional grid. Let $A = \{a_1, a_2, \ldots, a_m\}$ be a set of moving agents on $V^d$. At t = 0, the agents spread
over the space according to some distribution $\mathcal{D}$. Throughout this paper, we focus on the case where $\mathcal{D}$ is
uniform. Agents move in discrete time steps. Every agent performs a symmetric random walk defined in
the natural way. Specifically, at each time step an agent not at a boundary moves to one of its 2d neighbors,
each with probability 1/(2d). If an agent is at a boundary, so there is no edge in one or more directions, we
treat each missing edge as a self-loop. Let $\Xi_1(t), \ldots, \Xi_m(t) \in \{0, 1\}$ each be a random variable, where
$\Xi_i(t)$ represents whether the agent $a_i$ is infected at time step t. We assume $\Xi_1(0) = 1$ and $\Xi_i(0) = 0$ for all $i \ne 1$.
The value $\Xi_i(t)$ will change from 0 to 1 if at time t agent $a_i$ is within distance 1 of another infected agent $a_j$. (We
use distance 1 instead of distance 0 to avoid parity issues.) Once a value $\Xi_i(t)$ becomes 1, it stays 1.

Definition 1.1. (Information diffusion problem). Let $A_1, A_2, \ldots, A_m \in V^d$ be the initial positions of
the agents $a_1, \ldots, a_m$ and let $S^1_t(A_1), S^2_t(A_2), \ldots, S^m_t(A_m)$ be m independent random walks starting at
$A_1, \ldots, A_m$ respectively, so that $S^i_t(P)$ is the position of agent $a_i$ at time t given that at t = 0 its position
was $P \in V^d$. The infectious state of each agent at time step t is a binary random variable $\Xi_i(t)$ such that

• $\Xi_1(0) = 1$, $\Xi_i(0) = 0$ for all other i, and

• for all t > 0, $\Xi_i(t) = 1$ if and only if

$$\big(\Xi_i(t-1) = 1\big) \quad\text{or}\quad \big(\exists j : \Xi_j(t-1) = 1 \,\wedge\, \|S^i_t(A_i) - S^j_t(A_j)\|_1 \le 1\big).$$

We define the finishing time of the diffusion process, or the diffusion time, as
$T = \inf\{t \ge 0 : |\{i : \Xi_i(t) = 1\}| = m\}$.
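
The process in Definition 1.1 can be simulated directly for small instances. The following is a minimal sketch, not the authors' code: the function names (`walk_step`, `near_carrier`, `simulate_diffusion`), the `max_steps` cutoff, and the use of Python's `random` module are assumptions made for illustration. It implements the rule as stated above: every agent moves first, and then only agents infected at time t-1 transmit at time t.

```python
import random


def walk_step(pos, n, rng):
    """Move every agent by one step of the random walk on {-n, ..., n}^d."""
    for p in pos:
        axis = rng.randrange(len(p))            # choose one of the d axes
        target = p[axis] + rng.choice((-1, 1))  # choose a direction on it
        if -n <= target <= n:                   # a missing edge is a self-loop
            p[axis] = target


def near_carrier(p, carriers):
    """True if some carrier position is within L1 distance 1 of p."""
    if tuple(p) in carriers:
        return True
    for axis in range(len(p)):
        for delta in (-1, 1):
            q = list(p)
            q[axis] += delta
            if tuple(q) in carriers:
                return True
    return False


def simulate_diffusion(n, m, d=3, max_steps=1_000_000, seed=0):
    """Return the diffusion time T, or None if max_steps is exceeded."""
    rng = random.Random(seed)
    # Uniform initial placement; agent index 0 plays the role of a_1.
    pos = [[rng.randint(-n, n) for _ in range(d)] for _ in range(m)]
    infected = [True] + [False] * (m - 1)
    t = 0
    while not all(infected):
        if t >= max_steps:
            return None
        t += 1
        walk_step(pos, n, rng)
        # Only agents infected at time t-1 can transmit at time t.
        carriers = {tuple(p) for p, inf in zip(pos, infected) if inf}
        for i, p in enumerate(pos):
            if not infected[i] and near_carrier(p, carriers):
                infected[i] = True
    return t
```

For example, `simulate_diffusion(n=10, m=50)` returns one sample of T for 50 agents on the grid $\{-10, \ldots, 10\}^3$; averaging over several seeds gives a rough empirical estimate that can be compared with the bounds stated below, keeping in mind that those bounds are asymptotic.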

The following results for the diffusion time for 1- and 2-dimensional spaces are proved in [13].

Theorem 1.2. Consider the information diffusion problem for d = 1, 2 dimensions, and assume the agents
are initially uniformly distributed over $V^d$. Then, with high probability,

$$T = \tilde{\Theta}(n^2 \cdot m^{-1/d}). \qquad (1)$$

It is natural to ask whether Equation (1) also holds for $d \ge 3$. Our results show this is not the case.

Theorem 1.3. (Diffusion time for $d \ge 3$) Consider the information diffusion problem for $d \ge 3$ with initially
uniformly distributed agents over $V^d$. Then there exists a constant c such that

if $c\,n^{d-2} \log^{2} n < m < n^d$: $T = \tilde{\Theta}(n^{d/2+1} \cdot m^{-1/2})$ with high probability;

if $m < c\,n^{d-2} \log^{-2} n$: $T \le \tilde{\Theta}(n^d/m)$ with high probability and $T \ge \tilde{\Theta}(n^d/m)$ almost surely.

Notice that Theorems 1.3 and 1.2 yield the same result for d = 2, as well as when d = 1 and $m = \tilde{\Theta}(n)$.
Here when we say with high probability, we mean the statement holds with probability $1 - n^{-\gamma}$ for any
constant $\gamma$ and suitably large n. When we say almost surely, we mean with probability $1 - o(1)$. When
$m > n^d$, the result is implicit in [10] and the diffusion time in this case is $\Theta(n)$. Finally, there are some
technical challenges regarding the case $c\,n^{d-2} \log^{-2} n < m < c\,n^{d-2} \log^{2} n$ that we expect to address in a
later version of this work.

An interesting point of our result is that when the number of agents m is greater than $n^{d-2}$, the finishing
time is less than the mixing time of each individual random walk, and therefore the analysis requires
techniques that do not directly utilize the mixing time. The rest of this paper focuses on deriving both the lower
and upper bounds for this interesting case; the case where $m < c\,n^{d-2} \log^{-2} n$, which harnesses similar
ideas and a mixing time argument, is only briefly described at the end. Finally, as previously mentioned, for
space reasons we provide only the analysis for the three-dimensional case, and note that the results can be
generalized to higher dimensions.

Theorem 1.3 can also be expressed in terms of the density of agents. Let $\lambda = m/n^d$ be the
density. We can express the diffusion time as $T = \tilde{\Theta}(n/\sqrt{\lambda})$ w.h.p. for $c\,n^{-2} \log^{2} n < \lambda < 1$, whereas for
$\lambda < c\,n^{-2} \log^{-2} n$ we have $T \le \tilde{\Theta}(1/\lambda)$ w.h.p. and $T \ge \tilde{\Theta}(1/\lambda)$ almost surely.
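
The density form follows from Theorem 1.3 by direct substitution. The short calculation below is only a restatement under the definition $\lambda = m/n^d$; polylogarithmic factors hidden in $\tilde{\Theta}$ are ignored.

```latex
\begin{align*}
n^{d/2+1}\, m^{-1/2} &= n \cdot \left(\frac{n^{d}}{m}\right)^{1/2} = \frac{n}{\sqrt{\lambda}}, &
\frac{n^{d}}{m} &= \frac{1}{\lambda}, \\
c\,n^{d-2}\log^{2} n < m < n^{d}
  &\iff c\,n^{-2}\log^{2} n < \lambda < 1, &
m < c\,n^{d-2}\log^{-2} n
  &\iff \lambda < c\,n^{-2}\log^{-2} n .
\end{align*}
```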

We remark that all theorems/propositions/lemmas in this paper are assumed to hold for sufficiently 
large n, but for conciseness we may not restate this condition in every instance. 
2 Preliminary results for random walks

In this section we lay out some preliminary results on random walks that will be useful in the subsequent 
sections. These results focus on probabilistic estimates for the meeting time/position of multiple random 
walks. Along the way, we will also illustrate the limitations of some of these estimates, which motivate the
more sophisticated techniques used in our subsequent analysis. For conciseness, all proofs in this section
are left to Appendix B.

Let $\mathbb{Z}$ be the set of integers, and $\mathbb{Z}^3$ be the set of integral lattice points in $\mathbb{R}^3$. For two points $A, B \in \mathbb{Z}^3$,
we write $A - B$ as the 3-dimensional vector pointing from B to A. For a vector $\vec{x} \in \mathbb{R}^3$, denote the ith
coordinate of $\vec{x}$ as $x_i$. Define the $L_p$ norm of a vector as $\|\vec{x}\|_p = \big(\sum_{i \le 3} |x_i|^p\big)^{1/p}$, and also the infinity-norm
in the standard manner, $\|\vec{x}\|_\infty = \max_{i \le 3} |x_i|$. We moderately overload x in this paper, i.e., $x$ is a scalar and
$\vec{x}$ is a vector.

Let $S^1$ and $S^2$ be two random walks in either $V^3$ (bounded walks) or $\mathbb{Z}^3$ (unbounded walks). We say
two walks $S^1$ and $S^2$ meet at time t if their $L_1$-distance is at most 1 at that time, and two walks $S^1$ and $S^2$
collide at t if they are at exactly the same position at time t.

Definition 2.1 (Passage probability). Let S be a random walk in $\mathbb{Z}^3$ starting at the origin O. Let B be a
point with $B - O = \vec{x}$ (which is a three-dimensional vector). Define the probability that S is at B at time t
as $p(t, \vec{x})$. Define the probability that S visits B within time t as $q(t, \vec{x})$.

We want to characterize the chance that two or more random walks in either $V^3$ or $\mathbb{Z}^3$ meet. More
specifically, consider the following question. Let $A_1, \ldots, A_j$, and B be j + 1 points over the 3-dimensional
space $\mathbb{Z}^3$ such that for all $i \in [j]$, the $L_1$ distance between $A_i$ and B satisfies $\|A_i - B\|_1 \ge x$. Let $S^1(A_1), \ldots, S^j(A_j)$,
$S^{j+1}(B)$ be independent random walks that start at these points respectively. Our goal is to understand
the probability that all the walks $S^1, \ldots, S^j$ will meet or collide with the random walk $S^{j+1}$ within $x^2$ time
steps. We note that if the agent starting at B were stationary instead of following its own random walk, then
the analysis of the situation would be straightforward. In particular, the probability that all the walks would
intersect B is $\Theta(1/x^j)$. This follows from standard results, including Theorem A.10 and Lemma A.11
provided in the appendices. We need to consider the more challenging situation in which the agent starting at B is
also moving.

To begin, we shall consider the case where j = 1, so that we have just two moving agents. 

Definition 2.2. Let A and B be two points over $\mathbb{Z}^3$ such that $A - B = \vec{x}$, where $\|\vec{x}\|_1$ is an even number. Let
$S^1$ and $S^2$ be two independent unbounded random walks that start at A and B respectively. Define $Q(t, \vec{x})$
as the probability that $S^1$ and $S^2$ collide before time t.

We can use a simple coupling argument to relate $Q(t, \vec{x})$ to $q(t, \vec{x})$. The result is described as follows.

Lemma 2.3. Let A and B be two points over $\mathbb{Z}^3$ such that $A - B = \vec{x}$, where $\|\vec{x}\|_1$ is an even number.
Consider $Q(t, \vec{x})$ and $q(t, \vec{x})$ defined above. We have $Q(t, \vec{x}) = q(2t, \vec{x})$. Furthermore, for $t \ge \|\vec{x}\|_2^2$,

$$Q(t, \vec{x}) = \Theta(1/\|\vec{x}\|_2).$$

Next, let us move to the case of j random walks in $\mathbb{Z}^3$, in which j > 1.

Lemma 2.4. Let $A_1, A_2, \ldots, A_j$, and B be points in $\mathbb{Z}^3$ such that $\|A_i - B\|_1$ is even and $\|A_i - B\|_1 \ge x$ for
all $i \le j$. Let $S^1(A_1), \ldots, S^j(A_j), S^{j+1}(B)$ be j + 1 independent random walks that start at $A_1, \ldots, A_j, B$
respectively. Let $t = x^2$. Then the probability that all the walks $S^1, \ldots, S^j$ collide with $S^{j+1}$ within time t
is at most ( ^ I , where £ is a sufficiently large constant. 
Lemma 2.4 also helps us to analyze the scenario in which agents need to meet rather than to collide.
This is summarized by the following corollary:

Corollary 2.5. Let $A_1, A_2, \ldots, A_j$, and B be points in $\mathbb{Z}^3$ such that $\|A_i - B\|_1 \ge x$ for all $i \le j$. Let
$S^1(A_1), \ldots, S^j(A_j), S^{j+1}(B)$ be j + 1 independent random walks that start at $A_1, \ldots, A_j, B$ respectively.

Let t = x 2 . Then the probability that all the walks S 1 , S-> meet with S^ +1 within time t is at most (j^f^ > 
where is a sufficiently large constant. 

We note that both Lemma 2.4 and Corollary 2.5 are useful only when x is large enough. This forms a
barrier for the analysis of close agents in our model. But as we will see, we can get around this issue by looking
at a coupled diffusion process that possesses a different diffusion rule specifically designed for handling
close agents.

Another important issue is the analysis of walks that are close to the boundary. For this, we show that
the random walks do not behave significantly differently (in terms of the desired bounds) when boundaries
are added. We notice that similar results are presented in [13], but their results do not immediately translate
to the building blocks we need here. The following is the major building block we need for our analysis:

Lemma 2.6. Let A and B be two points in $V^3$ such that $A - B = \vec{x}$ and the distance between A and
any boundary is at least $40\|\vec{x}\|_1$. Consider two random walks $S^1(A)$ and $S^2(B)$ that start at A and B
respectively. Let $e_t$ be the event that $S^1(A)$ and $S^2(B)$ meet before time $t = \|\vec{x}\|_1^2$ and before either of
them visits a boundary. Then $\Pr[e_t] = \Omega(1/\|\vec{x}\|_1)$.

3 Lower bound 

Let us first state our lower bound result more precisely as follows. 

Theorem 3.1. Let $a_1, \ldots, a_m$ be placed uniformly at random on $V^3$ such that $1600\, n \log^{2} n < m < n^3$. Let
$\ell_2 = \sqrt{n^3/m}$. For sufficiently large n, the diffusion time T satisfies the following inequality:

$$\Pr\Big[T \le \frac{1}{81}\,\ell_2\, n \log^{-29} n\Big] \le \exp(-\log n \log\log n).$$

We use a local analysis to prove our lower bound. The key idea is that under uniform distribution of 
agents, the extent any particular infected agent can spread the virus within a small time increment is confined 
to a small neighborhood with high probability. By gluing together these local estimates, we can approximate 
the total diffusion time. 

To explain our local analysis, assume we start with an arbitrary infected agent, say $a_1$. Let us also
assume, for simplicity, that all the other uniformly distributed agents are uninfected. Consider the scenario
within a small time increment, say $\Delta t$. During this time increment the agent $a_1$ infects whoever it meets in
the small neighborhood that contains its extent of movement. The newly infected agents then continue to
move and infect others. The size of the final region that contains all the infected agents at $\Delta t$ then depends on
the rate of transmission and the extent of movement of all of the infected agents. In particular, if $\Delta t$ is small
enough, the expected number of transmissions performed by $a_1$ is less than one; even if it infects another
agent, the number of infections that agent causes within the same $\Delta t$ is also less than one, and so on. The net effect
is an eventual dying-out of this "branching process" (which we later model by what we call a diffusion tree),
which localizes the positions of all infected agents at time $\Delta t$ to a small neighborhood around the initial
position of $a_1$.

As it may not be clear as we go through our proofs, we briefly review the main methodologies for
obtaining lower bound results in related work, and point out their relation to our analysis and the difficulties
in directly applying them to higher dimensions. (Some readers may wish to skip these next paragraphs
altogether; for others, who would like a more thorough discussion that unavoidably requires more technical
details, we devote Appendix E to more details.) Two potential existing methods arise in [2, 10] and [13].
The former analyzes the growth rate of the size of the total infected region; an upper bound on this growth
rate translates to a lower bound for the diffusion time. The latter work, focusing on d = 1, 2, uses an
"island diffusion rule", which essentially speeds up infection by allowing infections to occur immediately
on connected components in an underlying graph where edges are based on the distance between agents.
This approach avoids handling the issue of the meeting time of random walks when they are very close, a
regime where asymptotic results such as Lemmas 2.3 and 2.4 may not apply, while still providing a way to
bound the diffusion time by arguing about the low probability of interaction among different "islands".

The results in [2, 10] are not directly applicable in our setting because the growth rate they obtain is
linear in time, as a result of their assumption of constant agent density in an infinite space, in contrast to our
use of a size parameter n that scales with the agent density. It is fairly simple to see that blindly applying a
linear growth rate to our setting of o(1) density is too crude. On the other hand, analyzing how agent density
affects the growth rate is a potentially feasible approach but certainly not straightforward.

Our approach more closely follows [13]. The main limitation of [13], when applied to higher
dimensions, is how to control the interaction among islands. If islands interact too often, because they are too
close together, the argument, which is based on a low probability of interaction, breaks down. However, if
one parametrizes islands to prevent such interaction, then the bounds that can be obtained are too weak. In
Appendix E.1 we provide further details arguing that for d > 2 this constraint ultimately limits the analysis
for the case of o(1) density. We attempt to remedy the problem by using islands as an intermediate step to
obtain local estimates of the influence of each initially infected agent over small periods of time. This
analysis involves looking at a branching process representing the spread of the infection, significantly extending
the approach of [13].

3.1 Local diffusion problem 

This subsection focuses on the local analysis as discussed above. In Section 3.2 we will proceed to
discuss how to utilize this analysis to get the lower bound in Theorem 3.1. As discussed in the last section,
the two main difficulties in our analysis are: 1) our probabilistic estimates for the meeting time/position
of multiple random walks are only useful asymptotically; 2) walks near the boundary introduce further
analytical complications. To begin with, the following definition serves to handle the second issue:

Definition 3.2 (Interior region). The interior region $\mathfrak{V}(r)$ parameterized by r is the set of lattice points in
$V^3$ that have at least $L_\infty$-distance r to the boundary.

For any point $P \in V^3$, define $B(P, x) = \{Q \in V^3 : \|Q - P\|_\infty \le x\}$ as the x-ball neighborhood of
P under the $L_\infty$-norm. The following proposition is our major result in this subsection.

Proposition 3.3. Consider a diffusion following Definition 1.1. Let $S^1_0$ be the initial position of the only
infected agent $a_1$ at time 0, and W be an arbitrary subset of lattice points in $\mathfrak{V}(20\ell_2 \log n)$, where $\ell_2 =
\sqrt{n^3/m}$. Denote $\Delta t = \ell_2^2 \log^{-28} n$. Define the binary random variable b(W) as follows:

• If $S^1_0 \in W$: b(W) is set as 1 if and only if all the infected agents at time $\Delta t$ can be covered by the ball
$B(S^1_0, 9\ell_2 \log n)$.

• If $S^1_0 \notin W$: b(W) = 1.

We have

$$\Pr[b(W) = 1] \ge 1 - \exp(-5 \log n \log\log n). \qquad (3)$$
The proposition yields that with high probability, all the infected agents lie within a neighborhood of
distance $\tilde{O}(\ell_2)$ at time $\tilde{O}(\ell_2^2)$. The variable $\ell_2$ is chosen such that the expected number of infections caused
by an initially infected agent $a_1$ within $\Theta(\ell_2^2)$ units of time and a neighborhood of $\Theta(\ell_2)$ distance is $\Theta(1)$.
This can be seen by solving $m(\ell_2/n)^3 \times (1/\ell_2) = \Theta(1)$, where $m(\ell_2/n)^3$ is the expected number of agents
in a cube of size $\ell_2 \times \ell_2 \times \ell_2$, and $\Theta(1/\ell_2)$ is the meeting probability within time $\Theta(\ell_2^2)$ between any
pair of random walks with initial distance $\ell_2$ (see Lemma 2.3). This choice of $\ell_2$ appears to be the right
threshold for our analysis. Indeed, a larger scale than $\ell_2$ would induce a large number of infections made
by $a_1$, and also subsequent infections made by newly infected agents, with an exploding affected region as
an end result. On the other hand, a smaller scale than $\ell_2$ would degrade our lower bound. This is because
the diffusion time is approximately of order $n/\ell_2$, the number of spatial steps to cover $V^3$, times $\ell_2^2$, the time
taken for each step, equaling $n\ell_2$. Hence a decrease in $\ell_2$ weakens the bound.
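
The scaling argument in this paragraph can be written out as a short calculation. This is a heuristic sketch only: it treats all $\Theta(\cdot)$ constants as 1 and drops logarithmic factors, with $\ell_2$, n and m as in the surrounding text.

```latex
\begin{align*}
\underbrace{m\left(\frac{\ell_2}{n}\right)^{3}}_{\text{agents in an } \ell_2\text{-cube}}
\times \underbrace{\frac{1}{\ell_2}}_{\text{meeting probability}}
= \frac{m\,\ell_2^{2}}{n^{3}} = \Theta(1)
&\;\Longrightarrow\; \ell_2 = \Theta\!\left(\sqrt{n^{3}/m}\right), \\
T \approx \underbrace{\frac{n}{\ell_2}}_{\text{spatial steps}}
\times \underbrace{\ell_2^{2}}_{\text{time per step}}
= n\,\ell_2 &= n^{5/2}\, m^{-1/2}.
\end{align*}
```

The last expression agrees with the d = 3 case of Theorem 1.3, up to polylogarithmic factors.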

Secondly, we introduce W in Proposition 3.3 to avoid the case when $S^1_0$ is close to the boundary. As
we have mentioned, such boundary conditions often complicate random walk analysis. Although the impact
of the boundary's presence has been addressed (e.g., [13]), existing results are not fully satisfactory. For
example, when two simple random walks $S^1$ and $S^2$ start near the boundary, only a lower bound for the
probability that two walks meet within a specific number of time steps is available ([13]); we do not know
of an upper bound counterpart. We arrange our proof so that it is sufficient to analyze the diffusion pattern
of a virus when it starts far from the boundary. Finally, we note that no effort has been made to optimize the
exponent 28 in the definition of $\Delta t$.

We briefly explain how our global lower bound can be readily obtained from Proposition 3.3, which is
a strong characterization of the local growth rate of the infection region size. Imagine the following evolution.
Starting with a single infected agent, with high probability the infection spreads to a ball of radius at most
$9\ell_2 \log n$ in $\Delta t$ time units. At this time point, the newly infected agents inside the ball continue to spread
the virus to neighborhoods of size at most $9\ell_2 \log n$, again with high probability. This gives an enlarged area
of infection with radius at most $18\ell_2 \log n$. Continuing in this way, the lower bound in Theorem 3.1 is then
the time for the infection to spread over $V^3$. This observation will be made rigorous in the next subsection.

The rest of this subsection is devoted to the proof of Proposition 3.3. It consists of two main steps.
First, we need to estimate the expected number of infections caused by a single initially infected agent within
distance $9\ell_2 \log n$ and time increment $\Delta t$. Second, we iterate to consider each newly infected agent. The
analysis requires the condition that the global configuration behaves "normally", a scenario that occurs with
suitably high probability, as we show. We call this condition "good behavior", which is introduced through
the several definitions below:

Definition 3.4. (Island, [13]) Let $A = \{a_1, \ldots, a_m\}$ be the set of agents in $V^3$. For any positive integer
$\gamma > 0$, let $G_t(\gamma)$ be the graph with vertex set A such that there is an edge between two vertices if and only if
the corresponding agents are within distance $\gamma$ (under the $L_1$-norm) at time t. The island with parameter $\gamma$ of
an agent $a_i \in A$ at time step t, denoted by $\mathrm{Isd}_t(a_i, \gamma)$, is the connected component of $G_t(\gamma)$ containing $a_i$.
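
Islands are simply the connected components of a distance-threshold graph, so they can be computed with a union-find structure. The sketch below is illustrative only: the function names `islands` and `island_of` are hypothetical, positions are assumed to be given as integer tuples at a fixed time t, and the quadratic pairwise scan is chosen for clarity rather than efficiency.

```python
def islands(positions, gamma):
    """Return the islands (as sorted lists of agent indices) of G_t(gamma).

    Two agents are adjacent when their L1 distance is at most gamma.
    """
    m = len(positions)
    parent = list(range(m))

    def find(i):
        # Find the representative of i, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(m):
        for j in range(i + 1, m):
            dist = sum(abs(a - b) for a, b in zip(positions[i], positions[j]))
            if dist <= gamma:
                parent[find(i)] = find(j)  # merge the two components

    groups = {}
    for i in range(m):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def island_of(i, positions, gamma):
    """Return the island containing agent i."""
    return next(g for g in islands(positions, gamma) if i in g)
```

Note that two agents can share an island without being within distance $\gamma$ of each other, as long as a chain of intermediate agents connects them.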

Definition 3.5 (Good behavior). Let $\ell_1 = n m^{-1/3}$. For $1 \le i \le (\ell_2/\ell_1) \log^{-3} n$, define $B_i(P) =
B(P, i\ell_1 \log^{-1} n)$ and let $\partial B_i(P) = B_i(P) - B_{i-1}(P)$. For any $P \in V^3$ define $m_i(P)$ = (log5 gi[^' ( 3 p)|m .
Let us define the following binary random variables: 

• Good density. Let $\{D_t : t \ge 0\}$ be a sequence of 0,1 random variables such that $D_t = 1$ if and only
if for all $P \in V^3$ and all $i \le (\ell_2/\ell_1) \log^{-3} n$, the number of agents in $\partial B_i(P)$ is at most $m_i(P)$, for
all time steps up to t. We say the diffusion process has the good density property at time t if $D_t = 1$.

In the case of general d-dimensional space, $\ell_2$ is chosen such that $m(\ell_2/n)^d \times (1/\ell_2^{d-2}) = \Theta(1)$, giving $\ell_2 = \sqrt{n^d/m}$.
Throughout the paper such d-dimensional analogs can be carried out in similar fashion, but for ease of exposition we shall not bring
up these generalizations and will focus on the 3-dimensional case.
• Small islands. Let $\{E_t : t \ge 0\}$ be a sequence of 0,1 random variables such that $E_t = 1$ if and only
if $|\mathrm{Isd}_s(a_i, \ell_1 \log^{-1} n)| \le 3 \log n$ for all $a_i \in A$ and $0 \le s \le t$. We say that the diffusion process has
the small islands property at time t if $E_t = 1$.

• Short travel distance. Let $\{L_t : t \ge 0\}$ be a sequence of 0,1 random variables such that $L_t = 1$ if
and only if for all $i \in [m]$ and all $t_1 < t_2 \le t$ with $t_2 - t_1 < \ell_2^2 \log^{-12} n$, we have $\|S^i_{t_1} - S^i_{t_2}\|_1 \le
\ell_2 \log^{-4} n$. We say the process has the short travel distance property at time t if $L_t = 1$.

Finally, let $G_t = D_t \times E_t \times L_t$, and say the diffusion process behaves well at time t if $G_t = 1$. We also
focus on $t \le n^{2.5}$ and define the random variable $G = G_{n^{2.5}}$.

The value $n^{2.5}$ in the definition is chosen such that it lies well beyond our lower bound for the case
$m < n^3$, but is small enough for our forthcoming union bound. By using properties of random walks and
techniques derived in [13], we have

Lemma 3.6. Let $A = \{a_1, \ldots, a_m\}$ be agents that are distributed uniformly in $V^3$ at t = 0. For sufficiently
large n, we have $\Pr[G = 1] \ge 1 - \exp(-6 \log n \log\log n)$.

The proof of Lemma 3.6 is presented in Appendix C. With this global "good behavior", we have the
following estimate:

Lemma 3.7. Let $A = \{a_1, \ldots, a_m\}$ be agents that are distributed in $V^3$ in such a way that $D_0 = 1$.
Let $S^1, S^2, \ldots, S^m$ be their corresponding random walks. Consider an arbitrary agent $a_j$ with $S^j_0 \in
\mathfrak{V}(2\ell_2 \log^{-4} n)$. Let $\{a_{\ell_1}, \ldots, a_{\ell_k}\}$ be the set of agents outside $B_1(S^j_0)$ at time 0. Define $X_{j,\ell}$ as the
indicator random variable that represents whether the agents $a_j$ and $a_\ell$ meet within time $[0, \Delta t]$. We have

$$E\Big[\sum_{k' \le k} X_{j,\ell_{k'}} \,\Big|\, D_0 = 1,\ S^j_0 \in \mathfrak{V}(2\ell_2 \log^{-4} n)\Big] < \log^{-3} n.$$

Proof. First, notice that the number of lattice points in $\partial B_i(P)$ satisfies

$$|\partial B_i| = |B_i| - |B_{i-1}| \le (2i\ell_1 \log^{-1} n)^3 - (2(i-1)\ell_1 \log^{-1} n)^3 \le 24\, i^2 \ell_1^3 \log^{-3} n.$$

We may also similarly show that

$$|\partial B_i| \ge i^2 \ell_1^3 \log^{-3} n.$$

Let $q = (\ell_2/\ell_1) \log^{-3} n$. For each $i \in [q]$, write $B_i = B_i(S^j_0)$, $\partial B_i = \partial B_i(S^j_0)$, and $m_i = m_i(S^j_0)$. We
want to estimate the meeting probability and hence the expected number of infections for each $i \in [q]$.

First, let us consider the agents outside the ball B q . The probability that any specific agent initially 
outside B q ever travels into the ball B(5q, £ 2 log -4 n) within time £\ log -12 n is at most exp(— fi(log 3 n)) ( 
by, e.g., Lemma lATBI in the section on probability review). On the other hand, the probability that ever 



travels out of B(Sq , £ 2 log - n) is also exp(— J) (log n)). For these two agents to meet, at least one of these 
events has to occur. Therefore, with probability exp(— 0(log 3 n)) will meet an agent initially outside B q . 
This leads to 



$$E\Big[\sum_{a_\ell \text{ initially outside } B_q} X_{j,\ell} \,\Big|\, D_0 = 1,\ S^j_0 \in \mathfrak{V}(2\ell_2 \log^{-4} n)\Big] \le m \exp(-\Omega(\log^3 n)).$$



Let us next focus on agents inside B q . Fix an arbitrary G dBi. Let e 1 and e represents the 
events that S- 7 and S l£ ever visit a boundary before time l\ log -12 n respectively. Again by Lemma IA31 

Pr[e* V e e \D = 1, Sg € 23(2^ 2 log" 4 n)] = exp(-fi(log 3 n)). We now have 

E[X^\D Q = l,Sg e VB(2£ 2 log" 4 n)] 
= Pr[X jtt = 1; V A V|A) = 1, Sg € 2J(2^ 2 log" 4 n)] + Pr[X,-^ = 1; e j V e £ |A) = 1, Sg G 23(2£ 2 log -4 n)] 
< Pr[X jj£ = 1; V A V|A) = 1, Sg € 2J(2^ 2 log" 4 n)} + exp(-0(log 3 n)). 

To compute 



Pt[X jtt ) -e 3 A -.e'|£>o = 1, -Sg € »(2^ log -4 n) 



Pr 



*2 

< ; — 12 



log iz n 



l^-^UxlljAH^AV) 



D = l,5g G 23(2£ 2 log" 4 n) 



we couple S J and S** with unbounded walks S J and S te starting at the same positions at t = in the natural 
way. Before the pair of bounded walks visit the boundary, they coincide with their unbounded counterparts. 
Therefore, we have 



Pr 



< Pr 



D = l,5 i 5eQJ(2£ 2 log- 4 n) 



3t < 12 



g n 



SL-S*?||i<l 



D = I, Si e 23(2£ 2 log" 4 n) 



0( - ) (Corollary|23) 

(i - l)£i 



We thus have E[Xj^|Do = 1, Sg € 2I(2£ 2 log 4 n)] < nzffig[ for some constant Cq. Next, we estimate 



X j4 I D = 1, Sg € 23(2^2 log -4 n)] as 



(» - 1)4 (i - l)£i 



C |a^|mlog 5 n _ C (3i 2 ^log- 3 n)ralog n 6Co«m^log n 



8n 3 



(i - l)4n 3 



The first inequality holds because $D_0 = 1$ and $m_i$ is an upper bound for the number of agents in $\partial B_i$ (for all
i).



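$$
\begin{aligned}
E\Big[\sum_{k' \le k} X_{j,\ell_{k'}} \,\Big|\, D_0 = 1,\ S^j_0 \in \mathfrak{V}(2\ell_2 \log^{-4} n)\Big]
&\le \sum_{2 \le i \le q} E\Big[\sum_{a_\ell \in \partial B_i} X_{j,\ell} \,\Big|\, D_0 = 1,\ S^j_0 \in \mathfrak{V}(2\ell_2 \log^{-4} n)\Big]
 + \underbrace{m \times \exp(-\Omega(\log^3 n))}_{\text{upper bound for those outside } B_q} \\
&\le \sum_{2 \le i \le q} \frac{6 C_0\, i\, m\, \ell_1^2 \log^2 n}{n^3} + \exp(-\Omega(\log^2 n)) \\
&\le \frac{6 C_0\, q^2 m\, \ell_1^2 \log^2 n}{n^3} + \exp(-\Omega(\log^2 n)) \\
&= 6 C_0 \log^{-4} n + \exp(-\Omega(\log^2 n)) \\
&< \log^{-3} n
\end{aligned}
$$

for sufficiently large n, as $m < n^3$.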



□ 



Lemma 3.7 says that if the initial distribution of agents possesses good behavior, then one can ensure
that the expected number of direct infections on far-away agents is small. For agents close to the initially
infected agents, we instead utilize the concept of islands, which is also deeply related to the subsequent virus
spreading behavior. Now we formally introduce a new diffusion process with a modified "island diffusion"
rule. It is easy to see that this new diffusion process can be naturally coupled with the original diffusion
process (evolving according to Definition 1.1) by using the same random walks in the same probability space.

Definition 3.8 (Diffusion process with island diffusion rule). Consider a diffusion process in which m
agents are performing random walks on $V^3$. An uninfected agent $a_i$ becomes infected at time t if one of the
following conditions holds:

1. it meets a previously infected agent at time t. For convenience, we say $a_i$ is directly infected if it is
infected in this way.

2. it is inside $\mathrm{Isd}_t(a_j, \ell_1 \log^{-1} n)$, where $a_j$ is directly infected at time t.

This coupled process is different from the diffusion models introduced in [13, 16, 12]. In our
formulation, an island is infected only if a meeting occurs between one uninfected and one previously infected agent.
In [13, 16, 12] (using our notation), an island is infected once it contains a previously infected agent. As a
result, infections occur less frequently in our model than in the models of [13, 16, 12]. This difference is the
key to getting a tight lower bound for dimensions higher than 2. More precisely, our infection rule allows us
to build a terminating branching process, or what we call a "diffusion tree" in the following definition, whose
generations are defined via the infection paths from the source. The termination of this branching process
constrains the region of infection to a small neighborhood around the source with a probability of larger
order than that obtained in [13]. This in turn leads to a tighter global lower bound.

Definition 3.9 (Diffusion tree). Let $W \subseteq \mathfrak{B}(20\ell_2 \log n)$ be a subset of lattice points. Consider a diffusion,
following the island diffusion rule, that starts with an initially infected island $\mathrm{Isd}_0(a_1, \ell_1 \log^{-1} n)$. Recall
that $S^1_0$ denotes $a_1$'s position at $t = 0$. The diffusion tree Tr with respect to $W$ has the following components:

1. If $S^1_0 \notin W$, Tr $= \emptyset$.

2. If $S^1_0 \in W$,

• The root of Tr is a dummy node $r$.

• The children of $r$ are all the agents in $\mathrm{Isd}_0(a_1, \ell_1 \log^{-1} n)$.

• $a_{\ell'}$ is a child of $a_\ell$ ($a_{\ell'} \in \mathrm{child}(a_\ell)$) if $a_{\ell'}$ is infected by $a_\ell$ before time $\Delta t$.

• $a_{\ell'}$ is a direct child of $a_\ell$ ($a_{\ell'} \in \mathrm{dchild}(a_\ell)$) if $a_{\ell'} \in \mathrm{child}(a_\ell)$ and it is directly infected by $a_\ell$.

For technical reasons, if $a_\ell$ is not in Tr, we let $\mathrm{child}(a_\ell) = \emptyset$ and $\mathrm{dchild}(a_\ell) = \emptyset$.

A figure in the Appendix shows an example of the diffusion process and its corresponding diffusion
tree at $t = 0, 20, 40, 60$. Notice that the diffusion tree Tr stops growing after $\Delta t$ steps.

We refer to the root of the tree as the 0th level of the tree and count levels in the standard way. The height
of the tree is the number of levels in the tree. Note that a diffusion tree defined in this way can readily be
interpreted as a branching process (see, e.g., [19]), in which the $j$th generation of the process
corresponds to the $j$th level nodes in Tr.

Next we combine the good behavior variable $G_t$ with the diffusion tree. The motivation is that, roughly
speaking, consistently good behavior guarantees a small number of infections, or creation of children, at
each level. This can be seen through Lemma 3.7.

Definition 3.10 (Stopped diffusion tree). Consider a diffusion process with island diffusion rule, and let
$T(\ell)$ be the time that $a_\ell$ becomes infected in the process. The stopped diffusion tree Tr' (with respect to
$a_1$ and $W$) is the subtree of Tr induced by the set of vertices $\{a_\ell : a_\ell \in \mathrm{Tr} \wedge G_{T(\ell)} = 1\}$. We write
$a_\ell \in \mathrm{child}'(a_{\ell'})$ if $a_\ell \in \mathrm{child}(a_{\ell'})$ and $a_\ell \in \mathrm{Tr}'$. Similarly, $a_\ell \in \mathrm{dchild}'(a_{\ell'})$ if $a_\ell \in \mathrm{dchild}(a_{\ell'})$ and
$a_\ell \in \mathrm{Tr}'$.

Note that the definition of the stopped diffusion tree involves global behavior of the whole diffusion process
due to the introduction of $G_t$. On the other hand, Tr = Tr' with overwhelming probability, so we can
translate the properties of Tr' back to Tr easily.

We next show two properties of the (stopped) diffusion trees, one on the physical propagation of children
relative to their parents and one on the tree height. These are our main ingredients for proving
Proposition 3.3. The properties are in brief:

1. If $a_\ell$ is a child of $a_{\ell'}$ in the stopped diffusion tree Tr', $\|S^{\ell}_{T(\ell)} - S^{\ell'}_{T(\ell')}\|_\infty$ is $O(\ell_2)$.

2. The height of the stopped diffusion tree Tr' is $\tilde{O}(1)$ with high probability.

Proving the first item requires the following notion: 

Definition 3.11 (Generation distance). Consider the diffusion tree Tr with respect to $W$. Let $a_\ell$ be an
arbitrary agent and let $a_{\ell'}$ be its parent on Tr. The generation distance of $a_\ell$ with respect to Tr is

$$\mathcal{D}_\ell = \begin{cases} \|S^{\ell}_{T(\ell)} - S^{\ell'}_{T(\ell')}\|_1 & \text{if } a_\ell \text{ is in Tr and is at the 2nd or deeper level,} \\ 0 & \text{otherwise.} \end{cases}$$

In other words, the generation distance between $a_\ell$ and $a_{\ell'}$ is the distance between where $a_\ell$ and $a_{\ell'}$ were
infected. The generation distance of $a_\ell$ with respect to Tr' is $\mathcal{D}'_\ell$, which is set to be $\mathcal{D}_\ell$ if $a_\ell$ is in Tr' and 0
otherwise.

With this notion, we can derive the following lemma: 

Lemma 3.12. Consider the stopped diffusion tree with respect to $W$ that starts with an infected island
$\mathrm{Isd}_0(a_1, \ell_1 \log^{-1} n)$. For an agent $a_\ell$ in Tr', $\mathcal{D}'_\ell \le 4\ell_2$.

Proof. We focus on the non-trivial case that $S^1_0 \in W$ and that $a_\ell$ is at the 2nd level of Tr' or deeper. Suppose
that the diffusion process behaves well up to time $T(\ell)$, i.e., $G_{T(\ell)} = 1$. Let $a_{\ell'}$ be the parent of $a_\ell$ on Tr'.
By the construction of Tr', there exists an $a_{\ell''} \in \mathrm{dchild}'(a_{\ell'})$ (possibly $a_\ell$ itself) such that

• $a_\ell \in \mathrm{Isd}_{T(\ell'')}(a_{\ell''}, \ell_1 \log^{-1} n)$.

• $T(\ell) = T(\ell'')$, i.e., $a_\ell$ and $a_{\ell''}$ get infected at the same time due to the island diffusion rule.

• $\|S^{\ell'}_{T(\ell'')} - S^{\ell''}_{T(\ell'')}\|_1 \le 1$, i.e., $a_{\ell''}$ gets infected because it meets an infected agent.

By the triangle inequality,

$$\mathcal{D}'_\ell = \|S^{\ell'}_{T(\ell')} - S^{\ell}_{T(\ell)}\|_1 \le \|S^{\ell'}_{T(\ell')} - S^{\ell''}_{T(\ell'')}\|_1 + \|S^{\ell''}_{T(\ell'')} - S^{\ell}_{T(\ell)}\|_1.$$

Note that $\|S^{\ell'}_{T(\ell')} - S^{\ell''}_{T(\ell'')}\|_1 \le 3\ell_2 \log^{-4} n + 1 \le \ell_2$ (short travel distance property) and
$\|S^{\ell''}_{T(\ell'')} - S^{\ell}_{T(\ell)}\|_1 \le (\ell_1 \log^{-1} n)(3 \log n) = 3\ell_1 \le 3\ell_2$ (small island property) since $G_{T(\ell)} = 1$. Finally, the case when $G_{T(\ell)} = 0$
is trivial, and the lemma follows. □

Next we show that with high probability the height of the stopped diffusion tree is $\tilde{O}(1)$. Using standard
notation, we let $\{\mathcal{F}_t\}_{t \ge 0}$ be the filtration generated up to time $t$, i.e., $\mathcal{F}_t$ encodes all the
information regarding the diffusion process up to $t$. The special instance $\mathcal{F}_0$ is used to describe the initial
positions of the agents.

The main property of the stopped diffusion tree that we need is the following:

Lemma 3.13. Consider a diffusion process with the island diffusion rule. Let $a_\ell$ be an arbitrary agent with
infection time $T(\ell)$. We have

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)},\ S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n)\right] \le \log^{-3} n, \qquad (5)$$

where $\mathrm{dchild}'(\cdot)$ is defined for a stopped diffusion tree with respect to an arbitrary set $W \subseteq \mathfrak{B}(20\ell_2 \log n)$.

We regard the conditional expectation in Equation (5) as a random variable. The interpretation is that
the expected number of $a_\ell$'s direct children is at most $\log^{-3} n$, regardless of the global configuration at the
infection time of $a_\ell$, as long as it lies in $\mathfrak{B}(2\ell_2 \log^{-4} n)$ at that time.

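Proof. We focus on the case when $S^1_0 \in W$; otherwise Tr' is empty and the lemma trivially holds. First
observe that all $a_j \in \mathrm{Isd}_{T(\ell)}(a_\ell, \ell_1 \log^{-1} n)$ are infected at or before the time $a_\ell$ is infected. Therefore
they cannot be direct children of $a_\ell$ by Definitions 3.9 and 3.10. On the other hand, an agent $a_j$ is outside
$\mathrm{Isd}_{T(\ell)}(a_\ell, \ell_1 \log^{-1} n)$ only if it is outside the ball $B(S^{\ell}_{T(\ell)}, \ell_1 \log^{-1} n)$. Hence $|\mathrm{dchild}'(a_\ell)|$ is bounded
by the number of agents initially outside $B(S^{\ell}_{T(\ell)}, \ell_1 \log^{-1} n)$ that meet $a_\ell$ before time $\Delta t$. We consider two
cases:

Case 1. $D_{T(\ell)} = 1$. By Lemma 3.7 we have

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)},\ D_{T(\ell)} = 1,\ S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n)\right] \le \log^{-3} n.$$

Case 2. $D_{T(\ell)} = 0$. By Definition 3.10 we have

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)},\ D_{T(\ell)} = 0,\ S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n)\right] \le \log^{-3} n.$$

Therefore,

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)},\ S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n)\right]
= \mathrm{E}\left[\,\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)},\ S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n),\ D_{T(\ell)}\right] \;\middle|\; \mathcal{F}_{T(\ell)},\ S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n)\right] \le \log^{-3} n.$$

□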

Recursive utilization of Lemma 3.13 on successive tree levels leads to the following lemma:

Lemma 3.14. Consider a diffusion process with the island diffusion rule starting with an infected island
$\mathrm{Isd}_0(a_1, \ell_1 \log^{-1} n)$. For the stopped diffusion tree Tr' with respect to any $W \subseteq \mathfrak{B}(20\ell_2 \log n)$, let
Height(Tr') be its height. Then we have

$$\Pr[\mathrm{Height}(\mathrm{Tr}') > 2 \log n] \le \exp(-3 \log n \log\log n). \qquad (6)$$

Let us denote the set of agents at the $k$th level as $F'_k$. It is worth pointing out that, despite a similar
analysis to that of a standard branching process, there is a technical complication in the conditioning argument,
since the children within the same level can be created at different times in the diffusion
process. This implies that there is no single filtration that we can condition on at each level to analyze the
expected size of the next one. Nevertheless, conditioning can be tailored to each agent at the same level.
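The order of magnitude in Equation (6) can be sanity-checked numerically. The sketch below assumes natural logarithms and assumes that the expected level size shrinks by a factor $3\log^{-2} n$ per level (an island of at most $3 \log n$ agents per direct child, times the $\log^{-3} n$ of Lemma 3.13), starting from at most $\log n$ agents; the starting size and the function names are illustrative assumptions, and $k$ is treated as a real number.

```python
import math

def log_generation_bound(n, k):
    """Natural log of log(n) * (3 / log(n)**2)**k, an assumed bound on the level-k size."""
    ln_n = math.log(n)
    return math.log(ln_n) + k * (math.log(3) - 2 * math.log(ln_n))

def log_target(n):
    """Natural log of exp(-3 log n log log n), the right-hand side of (6)."""
    ln_n = math.log(n)
    return -3 * ln_n * math.log(ln_n)

def bound_implies_target(n):
    """True if the assumed level-size bound at k = 2 log n is below the target."""
    return log_generation_bound(n, 2 * math.log(n)) <= log_target(n)
```

Under these assumptions the comparison holds for $n = 10^6$ but fails for $n = 100$, which is one way to see why the statement is only claimed for sufficiently large $n$.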
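Proof. We focus on the case when Tr' is non-trivial, i.e., $S^1_0 \in W$. Let $I(A) = 1$ if $A$ occurs and 0 otherwise.
We have, for any $k < 2 \log n$,

$$\mathrm{E}\left[\,|F'_{k+1}| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right]
= \mathrm{E}\left[\sum_{a_\ell \in F'_k} \sum_{a_{\ell'} \in \mathrm{dchild}'(a_\ell)} \left|\mathrm{Isd}_{T(\ell')}(a_{\ell'}, \ell_1 \log^{-1} n)\right| \, I(G_{T(\ell')} = 1) \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right]
\le (3 \log n)\, \mathrm{E}\left[\sum_{a_\ell \in F'_k} |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right].$$

The equality holds by the stopping rule and the inequality holds by the small islands property. Next we have

$$\begin{aligned}
\mathrm{E}\left[\sum_{a_\ell \in F'_k} |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right]
&= \mathrm{E}\left[\sum_{\ell \in [m]} I(a_\ell \in F'_k)\, |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right] \\
&= \sum_{\ell \in [m]} \mathrm{E}\left[ I(a_\ell \in F'_k)\, |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right] \\
&= \sum_{\ell \in [m]} \mathrm{E}\left[\, \mathrm{E}\left[ I(a_\ell \in F'_k)\, |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^1_0 \in W\right] \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right] \\
&= \sum_{\ell \in [m]} \mathrm{E}\left[\, I(a_\ell \in F'_k)\, \mathrm{E}\left[\, |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^1_0 \in W\right] \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right],
\end{aligned}$$

where the last step holds because $I(a_\ell \in F'_k)$ is $\mathcal{F}_{T(\ell)}$-measurable.

Note that $S^1_0 \in W \subseteq \mathfrak{B}(20\ell_2 \log n)$ implies $S^{\ell}_{T(\ell)} \in \mathfrak{B}(20\ell_2 \log n - 4k\ell_2) \subseteq \mathfrak{B}(2\ell_2 \log^{-4} n)$ if $G_{T(\ell)} = 1$,
by using Lemma 3.12. Therefore, by Lemma 3.13,

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^1_0 \in W, G_{T(\ell)} = 1\right]
= \mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^{\ell}_{T(\ell)} \in \mathfrak{B}(2\ell_2 \log^{-4} n), G_{T(\ell)} = 1\right] \le \log^{-3} n.$$

On the other hand,

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^1_0 \in W, G_{T(\ell)} = 0\right] = 0$$

by the stopping rule. This leads to

$$\mathrm{E}\left[\,|\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^1_0 \in W\right] \le \log^{-3} n,$$

which implies

$$\sum_{\ell \in [m]} \mathrm{E}\left[\, I(a_\ell \in F'_k)\, \mathrm{E}\left[\, |\mathrm{dchild}'(a_\ell)| \;\middle|\; \mathcal{F}_{T(\ell)}, S^1_0 \in W\right] \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right]
\le \log^{-3} n \; \mathrm{E}\left[\,|F'_k| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right].$$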
Therefore,

$$\mathrm{E}\left[\,|F'_{k+1}| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right] \le 3 \log^{-2} n \; \mathrm{E}\left[\,|F'_k| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right] \le (3 \log^{-2} n)^k \, \mathrm{E}\left[\,|F'_1| \;\middle|\; \mathcal{F}_0, S^1_0 \in W\right] \le \log n \, (3 \log^{-2} n)^k,$$

and hence

$$\Pr\left[\,|F'_{2 \log n}| > 0\right] \le \mathrm{E}\left[\,|F'_{2 \log n}|\right] \le \exp(-3 \log n \log\log n).$$

□

We now prove Proposition 3.3.

Proof of Proposition 3.3. First note that the set of infected agents in a diffusion process with the island diffusion
rule, namely Definition 3.8, is always a superset of that of the coupled original diffusion process using Definition
1.1, at any time from 0 to $\Delta t$. Next we have

$$\begin{aligned}
\Pr[\mathrm{Height}(\mathrm{Tr}) > 2 \log n]
&= \Pr[(\mathrm{Height}(\mathrm{Tr}) > 2 \log n) \wedge (\mathrm{Height}(\mathrm{Tr}) = \mathrm{Height}(\mathrm{Tr}'))] \\
&\quad + \Pr[(\mathrm{Height}(\mathrm{Tr}) > 2 \log n) \wedge (\mathrm{Height}(\mathrm{Tr}) \ne \mathrm{Height}(\mathrm{Tr}'))] \\
&\le \Pr[\mathrm{Height}(\mathrm{Tr}') > 2 \log n] + \Pr[\mathrm{Tr}' \ne \mathrm{Tr}] \\
&\le \exp(-3 \log n \log\log n) + \Pr[G \ne 1] \\
&\le 2 \exp(-3 \log n \log\log n).
\end{aligned}$$

Therefore, we have

$$\Pr[(\mathrm{Height}(\mathrm{Tr}) \le 2 \log n) \wedge (G = 1)] \ge 1 - 3 \exp(-3 \log n \log\log n).$$

We will show that the viruses can be covered by the ball $B(S^1_0, 9\ell_2 \log n)$ when

$$(\mathrm{Height}(\mathrm{Tr}) \le 2 \log n) \wedge (G = 1) \wedge (S^1_0 \in W).$$

Fix an arbitrary infected $a_\ell \in F'_k$ with $k \le 2 \log n$. By Lemma 3.12 we have $\|S^1_0 - S^{\ell}_{T(\ell)}\|_1 \le 8\ell_2 \log n$.
Moreover, $G = 1$ implies that for all $0 \le t' \le \Delta t$, $\|S^{\ell}_{T(\ell)} - S^{\ell}_{t'}\|_1 \le 3\ell_2 \log^{-4} n \le \ell_2 \log n$. This suggests
$\|S^1_0 - S^{\ell}_{t'}\|_\infty \le 9\ell_2 \log n$ for all $t' \in [0, \Delta t]$. Therefore, the virus does not escape the ball $B(S^1_0, 9\ell_2 \log n)$
within time $[0, \Delta t]$. □

3.2 From local to global process

This section is devoted to proving Theorem 3.1 via Proposition 3.3, or in other words, to turning our
local probabilistic bound into a global result on the diffusion time.

We note that Proposition 3.3 deals with the case when there is only one initially infected agent. As
discussed briefly in the discussion following the proposition, we want to iterate this estimate so that at every
time increment $\Delta t$, the infected region is constrained within a certain radius from the initial positions of all
the agents that are already infected at the start of the increment. Our argument is aided by noting which
agents infect other agents. To ease the notation for this purpose, we introduce an artificial concept of virus
type, denoted by $\nu_{i,t}$. We say an agent gets a virus of type $\nu_{i,t}$ if the meeting events of this agent can be
traced upstream to the agent $a_i$, where $a_i$ is already infected at time $t$. In other words, assume that $a_i$ is
infected at time $t$, and imagine that we remove the viruses in all infected agents except $a_i$ but we keep the
same dynamics of all the random walks. We say a particular agent gets $\nu_{i,t}$ if it eventually gets infected
under this imaginary scenario. Note that under this artificial framework of virus types it is obvious that an
agent can get many different types of virus, in terms of both $i$ and $t$.

In parallel to Proposition 3.3, we introduce the family of binary random variables $b_{i,t}$ to represent
whether a virus of type $\nu_{i,t}$ can be constrained in a ball with radius $9\ell_2 \log n$:

Definition 3.15 ($b_{i,t}$ and virus of type $\nu_{i,t}$). Let $\mathfrak{B} = B(P, f )$ where $P = (n/2, n/2, n/2)$. Let $a_1, \ldots, a_m$
be agents that are uniformly distributed on $V^3$ at $t = 0$ and diffuse according to Definition 1.1. Let $t$ be an
arbitrary time step and $i \in [m]$. At time $t$, a virus of type $\nu_{i,t}$ emerges on agent $a_i$ and diffuses. Define the
binary random variable $b_{i,t}$ as follows:

• If $S^i_t \in \mathfrak{B}$: $b_{i,t}$ is set as 1 if and only if all the agents infected by the virus of type $\nu_{i,t}$ at time $t + \Delta t$
can be covered by the ball $B(S^i_t, 9\ell_2 \log n)$.

• If $S^i_t \notin \mathfrak{B}$: $b_{i,t} = 1$.

Let us start with showing $b_{i,t} = 1$ for all $i$ and $t$ with high probability:

Corollary 3.16. Consider the family of random variables $\{b_{i,t} : i \in [m], t \le n^{2.5}\}$ defined above. We have

$$\Pr\left[\bigwedge_{i \in [m],\, t \le n^{2.5}} (b_{i,t} = 1)\right] \ge 1 - \exp(-4 \log n \log\log n).$$

Proof. We first bound $\Pr[b_{i,t} = 1]$ for any specific $i$ and $t$. Since the agents are placed according to the
stationary distribution at $t = 0$, each agent is still distributed uniformly at time $t$. Next, at time $t$, we may
relabel the agents so that $a_i$ is regarded as the single initially infected agent in Proposition 3.3, where $W$ is
set as $\mathfrak{B}$. We therefore have $\Pr[b_{i,t} = 1] \ge 1 - \exp(-5 \log n \log\log n)$.

Next, we may apply a union bound across all $i$ and $t$ to get the desired result. □

Lemma 3.17. Let $\hat{\mathfrak{B}} = B(P, n/8)$. Let $B_t$ be the indicator variable that there is at least one agent in $\hat{\mathfrak{B}}$
at time $t$. Let $B = \prod_{t \le n^{2.5}} B_t$, the indicator variable that there is at least one agent in $\hat{\mathfrak{B}}$ at all times in
$[0, n^{2.5}]$. We have

$$\Pr[B = 0] \le \exp(-\log^2 n)$$

for sufficiently large $n$.

Proof. First, notice that for any specific $t$, the expected number of agents in $\hat{\mathfrak{B}}$ is $\Omega(m)$. Therefore, by
a Chernoff bound (using the version in Theorem A.1), $\Pr[B_t = 0] \le \exp(-\Omega(m)) \le \exp(-\log^3 n)$. Next,
by a union bound, we have $\Pr[B = 0] \le n^{2.5} \exp(-\log^3 n) \le \exp(-\log^2 n)$. □
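The last step of the union bound can be checked in log space: $n^{2.5}\exp(-\log^3 n) \le \exp(-\log^2 n)$ is equivalent to $2.5\log n - \log^3 n \le -\log^2 n$. The following sketch assumes natural logarithms; the function name is hypothetical.

```python
import math

def union_bound_holds(n):
    """True if n**2.5 * exp(-log(n)**3) <= exp(-log(n)**2), compared via logarithms."""
    ln_n = math.log(n)
    return 2.5 * ln_n - ln_n ** 3 <= -(ln_n ** 2)
```

With natural logarithms the inequality reduces to $\log^2 n - \log n - 2.5 \ge 0$, so in this sketch it already holds from $n = 9$ onward.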

We next present our major lemma for this subsection.

Lemma 3.18. Let $a_1, \ldots, a_m$ be placed uniformly at random on $V^3$ such that $m \ge 1600\, n \log^2 n$. Let
$\ell_2 = \sqrt{n^3/m}$. Let $\{b_{i,t} : i \in [m], t \le n^{2.5}\}$ and $B$ be the random variables described above. If $b_{i,t} = 1$ for
all $i, t$ and $B = 1$, then the diffusion time is at least $T_c = \frac{1}{81} \ell_2 n \log^{-29} n$.

Notice that by Proposition 3.3 and Lemma 3.17,

$$\Pr\left[\bigwedge_{i \le m,\, t \le n^{2.5}} (b_{i,t} = 1)\right] \ge 1 - \exp(-4 \log n \log\log n),$$

$$\Pr[B = 1] \ge 1 - \exp(-\log^2 n).$$

Together with Lemma 3.18, Theorem 3.1 then follows.

Proof. Without loss of generality, we assume the $x$, $y$, and $z$ coordinates of $S^1_0$ are all negative. We can
always rotate the space $V^3$ at $t = 0$ correspondingly to ensure that this assumption holds.

We shall prove by contradiction. Consider the two balls $\mathfrak{B}$ and $\hat{\mathfrak{B}}$ defined above. Assume the diffusion
time is less than $T_c$. First, because $B = 1$, a necessary condition for the diffusion to complete is that an
infected agent visits the smaller ball $\hat{\mathfrak{B}}$ at some time $T' \le T_c$ (since otherwise the agents in $\hat{\mathfrak{B}}$ would be
uninfected all the time, including at $T_c$). We call this agent $a_{i'}$. Next, for the infection to get into $\hat{\mathfrak{B}}$, it
must happen that there is an infected agent that enters $\mathfrak{B}$ from outside, whose infection trajectory eventually
reaches $a_{i'}$. We denote by $T''$ the last time that this happens, and by $a_{i''}$ the responsible agent. We
focus on the trajectory of infection that goes from $a_{i''}$ to $a_{i'}$ and lies completely inside $\mathfrak{B}$ (which exists since
$T''$ is the last time of entry). Note that we consider at most $\lceil T_c/\Delta t \rceil$ time increments of $\Delta t$. Now, since
$b_{i,t} = 1$ for all $i$ and $t$, by repeated use of the triangle inequality, we get

/ (1/81)^™ log -29 n 



I Q l 
I 



qt I 



< 



logn 



" T 

± c 






< 9£ 2 logn ^ 


At 





:iog 



-28 



+ 1 



/Z 



n „ , n 
< -+ 9£ 2 logn < - 
y o 



On the other hand, the physical dimensions of 23 and 23 give that 



I q 1 c 



T" 1 1 oo 



> 



n 



which gives a contradiction. □

4 Upper bound

We now focus on an upper bound for the diffusion time. Our main result is the following:

Theorem 4.1. Let $a_1, \ldots, a_m$ be placed uniformly at random on $V^3$, where $n \le m \le n^3$. Let
$\hat{\ell}_2 = \sqrt{n^3/m} \cdot \log n$. When $n$ is sufficiently large, the diffusion time $T$ satisfies



Pr[T > 128ni 2 log 47 n] < exp(-- log 2 n). 



Note that this theorem shows that an upper bound of $\tilde{O}(n\sqrt{n^3/m})$ holds for the diffusion time with
high probability. Hence the upper and lower bounds "match" up to logarithmic factors. We remark that the
constant 47 in the exponent has not been optimized.

The main goal of this section is to prove this theorem. Our proof strategy relies on calculating the 
growth rate of the total infected agents evolving over time; such growth rate turns out to be best characterized 
as the increase/decrease in infected/uninfected agents relative to the size of the corresponding population. 
More precisely, we show that for a well-chosen time increment, either the number of infected agents doubles 
or the number of uninfected agents reduces by half with high probability. The choice of time increment 
is complex, depending on the analysis of the local interactions in small cubes and the global geometric 
arrangements of these cubes with respect to the distribution of infected agents. 

As with the lower bound proof, our technique for proving Theorem 4.1 is different from existing methods.
Let us briefly describe them and explain the challenges in extending to higher dimensional cases;
further details are left to Appendix E.2. Roughly, existing methods can be decomposed into two steps (see
for example [13]): 1) In the first step, consider a small ball of length $r$ that contains the initially infected
agent. One can see that for $d = 2$, when the number of agents in the ball is $\tilde{\Theta}(m(r/n)^2)$, within a time increment
$r^2$ the number of infections to agents initially in this ball is $\tilde{\Omega}(1)$ w.h.p. 2) The second step is to prove that for
any ball that has $\tilde{\Theta}(1)$ infected agents at time $t$, its surrounding adjacent balls will also have $\tilde{\Theta}(1)$ infected
agents by time $t + r^2$. From these two steps, one can recursively estimate the time to spread infection across
the whole space $V^2$ to be $(n/r) \times r^2 = nr$ w.h.p. In other words, at time $nr$ all the balls in $V^2$ will have $\tilde{\Theta}(1)$
infected agents. Moreover, every agent in $V^2$ is infected in the same order of time units, because $\tilde{\Theta}(1)$ is
also the total number of agents in any ball under the good density condition. Finally, it is then clear that a good
choice of $r$ is $n/\sqrt{m}$, which would give the optimal upper bound.
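The choice of $r$ in the two-dimensional argument can be spelled out as a short calculation. It is only a back-of-the-envelope restatement of the sentences above, ignoring polylogarithmic factors:

```latex
\[
  m\left(\frac{r}{n}\right)^{2} = \tilde{\Theta}(1)
  \;\Longrightarrow\;
  r = \tilde{\Theta}\!\left(\frac{n}{\sqrt{m}}\right),
  \qquad
  \frac{n}{r}\cdot r^{2} = nr = \tilde{\Theta}\!\left(\frac{n^{2}}{\sqrt{m}}\right).
\]
```

That is, the ball is sized so that it holds roughly one agent, and the infection then needs about $n/r$ hops of duration $r^2$ each to cross $V^2$.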

The critical difference in the analysis for $d > 2$ lies primarily in the magnitude of the meeting probability
of random walks. In the case of $d = 2$, the meeting probability of two random walks at distance $r$
within time $r^2$ is $\tilde{\Theta}(1)$, whereas for $d > 2$ the meeting probability is $\tilde{\Theta}(1/r^{d-2})$. For $d = 2$, this means
that it is easy, i.e., w.h.p., for infection to transmit from a ball with $\tilde{\Theta}(1)$ infected agents to an adjacent
uninfected ball, so that the latter also has $\tilde{\Theta}(1)$ infected agents after a time increment of $r^2$. In the case
$d > 2$, however, $\tilde{\Omega}(r^{d-2})$ infected agents must be present in a ball to transmit virus effectively to its adjacent
uninfected ball within $r^2$ time. Consequently, arguing for transmission across adjacent balls becomes
problematic (more details are in Appendix E.2). In light of this, we take an alternate approach to analyze
both the local interactions and the global distribution of infected agents. Instead of focusing on transmission
from one infected ball to another, we calculate the spreading rate across the whole space. This turns out to
be fruitful in obtaining a tight upper bound.
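The gap between $d = 2$ and $d = 3$ can be observed in a small Monte Carlo experiment. The sketch below is illustrative only and rests on assumptions not taken from the paper: the walks live on the unbounded lattice (no boundary), both agents move one step along a uniformly random axis per time unit, and "meeting" means $L_1$ distance at most 1. The function name and parameters are hypothetical.

```python
import random

def meet_probability(d, r, trials=2000, seed=0):
    """Estimate the chance that two simple random walks on Z^d, started at
    distance r, come within L1 distance 1 of each other within r*r steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [0] * d
        b = [r] + [0] * (d - 1)
        for _ in range(r * r):
            # Each walker moves one unit along a random axis.
            for walker in (a, b):
                axis = rng.randrange(d)
                walker[axis] += rng.choice((-1, 1))
            if sum(abs(x - y) for x, y in zip(a, b)) <= 1:
                hits += 1
                break
    return hits / trials
```

Comparing `meet_probability(2, 6)` with `meet_probability(3, 6)` gives a noticeably smaller estimate in three dimensions, in line with the qualitative difference described above; the experiment says nothing about constants or logarithmic factors.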

We briefly describe the forthcoming analysis. As with the lower bound, we start with local analysis.
We partition the space $V^3$ into disjoint subcubes each of size $\hat{\ell}_2 \times \hat{\ell}_2 \times \hat{\ell}_2$. Here $\hat{\ell}_2$ is just a logarithmic factor
larger than $\ell_2$, the size of subcubes used for the lower bound, so that with overwhelming probability there
are at least $\hat{\ell}_2$ agents in a subcube. We show that, within every subcube, over a time increment of length
$\Theta(\hat{\ell}_2^2)$, the number of infections is roughly a $\tilde{\Theta}(1)$ fraction of the minimum of the number of infected and
uninfected agents. Hence, at least locally, we have the desired behavior described above.
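The partition into subcubes can be made concrete with a short sketch. It assumes natural logarithms, a real-valued side length $\hat{\ell}_2 = \sqrt{n^3/m}\cdot\log n$ (rounding conventions are not specified here), and 0-based subcube indices; all function names are hypothetical.

```python
import math

def subcube_side(n, m):
    """Side length hat-ell_2 = sqrt(n^3 / m) * log n of a subcube."""
    return math.sqrt(n ** 3 / m) * math.log(n)

def subcube_index(pos, n, side):
    """0-based index of the subcube containing lattice point pos in {-n, ..., n}^3."""
    return tuple(int((x + n) // side) for x in pos)

def expected_agents_per_subcube(n, m):
    """Expected number of the m uniformly placed agents that fall in one subcube."""
    side = subcube_side(n, m)
    return m * (side / (2 * n + 1)) ** 3
```

For instance, with $n = 100$ and $m = 10^4$ the side length is about 46 and a subcube holds roughly 120 agents in expectation, comfortably above $\hat{\ell}_2$.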

We then leverage the local analysis to obtain the global result. However, this is not straightforward. 
For example, consider the beginning when the number of infected agents is small. If infected agents are 
distributed uniformly throughout the whole space, it would be easy to show that new infections would 
roughly grow in proportion to the number of infected agents. However, if infected agents are concentrated 
into a small number of subcubes, we have to show that there are enough neighboring subcubes on the 
boundary of these infected subcubes that these subcubes become infected suitably rapidly, so that after the 
appropriate time increment the number of infected agents doubles. Similar arguments arise for the case 
when infected agents are dominant, with the end result being a halving of the uninfected population. 

We now make the above discussion rigorous. First, let $b = (2n+1)/\hat{\ell}_2$, so there are in total $b^3$ subcubes.
As in the previous section, we divide the time into small intervals. We reuse the symbol $\Delta t$ to represent the
length of each interval, but here we set $\Delta t = 16\hat{\ell}_2^2$. Our local bound is built within each subcube (and pair
of neighboring subcubes) in the time increment $\Delta t$:

Lemma 4.2. Let $W \subseteq V^3$ be a region that can be covered by a ball of radius $2\hat{\ell}_2$ under the $L_\infty$-norm. Let
$A^f$ and $A^u$ be subsets of infected and uninfected agents in $W$ at time $t$ such that $|A^f| = m_1$, $|A^u| = m_2$,
and $\max\{m_1, m_2\} = \hat{\ell}_2 / \log^2 n$. Given any initial placement of the agents of $A^f$ and $A^u$, let $M(t)$ be the
number of agents in $A^u$ that become infected at time $t + \Delta t$. We have

$$\Pr\left[ M(t) \ge \frac{\tau_0 \min\{m_1, m_2\}}{\log^4 n} \;\middle|\; \mathcal{F}_t \right] \ge \tau_0 \log^{-6} n$$

for some constant $\tau_0$, where $\mathcal{F}_t$ denotes the information of the whole diffusion process up to time $t$.

Proof. The high level idea of our proof is to count the total number of times the infected agents meet the
uninfected ones between time $t$ and $t + \Delta t$. The probability that two agents in $W$ meet each other within time
$\Delta t$ is approximately $\tilde{\Omega}(1/\hat{\ell}_2)$ (Lemma 2.3). The expected number of meetings is thus $\tilde{\Omega}(1/\hat{\ell}_2) \times m_1 m_2 =
\tilde{\Omega}(\min\{m_1, m_2\})$. The total number of newly infected agents is the number of meetings modulo possible
overcounts on each originally uninfected agent. If we can show that the number of meetings is $\tilde{O}(1)$ for
each uninfected agent, then we can conclude that $\tilde{\Omega}(\min\{m_1, m_2\})$ more agents become infected at time
$t + \Delta t$.

Two problems need to be addressed to implement this idea. First, when the agents are close to the
boundary, they may behave in a more complicated way than suggested by Lemma 2.3. Second, in the (rare)
case that an uninfected agent is surrounded by a large number of infected agents, it can possibly meet with
$\omega(1)$ infected agents, making it difficult to give an upper bound on the number of overcounts.

To address both problems, we wait $\hat{\ell}_2^2$ time steps before starting our analysis on infections. This time
gap is enough to guarantee that with constant probability, the agents are "locally shuffled" so that by time
$t + \hat{\ell}_2^2$,

1. all agents are reasonably far away from the boundaries,

2. the distance between any pair of agents is "appropriate" (in our case, the distance is between $\hat{\ell}_2$ and
$9\hat{\ell}_2$).

Intuitively, the "local shuffling" works because the central limit theorem implies that the agents' distribution
at the end of these steps is approximately multivariate Gaussian.

We now implement this idea. First, we couple the (sub)process in $W$ with one that has a slower diffusion
rule. In the coupled process, we first wait for $\hat{\ell}_2^2$ time steps, in which no agent becomes infected even if it
meets an infected agent. After these $\hat{\ell}_2^2$ steps, for arbitrary $a_i \in A^f$ and $a_j \in A^u$, let $X_{i,j} = 1$ if both of
the following conditions hold:

• the $L_1$-distance between $a_i$ and $a_j$ is between $\hat{\ell}_2$ and $9\hat{\ell}_2$.

• the $L_1$-distance between $a_i$ and any boundary is at least $360\hat{\ell}_2$.

By Corollary A.13, $\Pr[X_{i,j} = 1] \ge \tau$ for some constant $\tau$. Therefore, $\mathrm{E}[\sum_{a_i \in A^f, a_j \in A^u} X_{i,j}] \ge \tau m_1 m_2$.
On the other hand, $\sum_{i,j} X_{i,j} \le m_1 m_2$. It follows easily that $\Pr[\sum_{i,j} X_{i,j} \ge \frac{1}{2}\tau m_1 m_2] \ge \tau/2$.
Our slower diffusion rule then allows $a_i \in A^f$ to transmit its virus to $a_j \in A^u$ if and only if

• $X_{i,j} = 1$,

• they meet during the time interval $(t + \hat{\ell}_2^2, t + \Delta t]$,

• $a_i$ and $a_j$ have not visited any boundary after $t + \hat{\ell}_2^2$ before they meet. In other words, an agent $a_i \in A^f$
($a_j \in A^u$ resp.) loses its ability to transmit (receive resp.) the virus when it hits a boundary after the
initial waiting stage.

An added rule is that agents in $A^u$ will not have the ability to transmit the virus even after they are
infected.

Let $Y_{i,j}$ be the indicator random variable that is set to 1 if and only if $a_i$ transmits its virus to $a_j$ under
the slower diffusion rule. By Lemma 2.6, we have $\Pr[Y_{i,j} = 1 \mid X_{i,j} = 1] = \Omega(1/\hat{\ell}_2)$. Therefore, we have

$$\Pr[Y_{i,j} = 1] \ge \Pr[Y_{i,j} = 1 \mid X_{i,j} = 1] \Pr[X_{i,j} = 1] = \Omega(1/\hat{\ell}_2).$$

Hence,

$$\mathrm{E}\Big[\sum_{i,j} Y_{i,j}\Big] = \Omega(m_1 m_2/\hat{\ell}_2) \ge \tau_1 m_1 m_2/\hat{\ell}_2$$

for some constant $\tau_1$. The sum $\sum_{a_i \in A^f, a_j \in A^u} Y_{i,j}$ is approximately the number of newly infected agents except that
the same agent in $A^u$ may be counted multiple times. Our next task is thus to give an upper bound on the
number of overcounts. Specifically, we fix an agent $a_j \in A^u$ and argue that the probability that $\sum_{a_i \in A^f} Y_{i,j} \ge
\log^2 n$ is small.

In our slower diffusion model, once an agent in $A^f$ reaches the boundary, it is not able to transmit the
virus further. We need to bound the probability that there are more than $\log^2 n$ agents in $A^f$ that transmit
the virus to $a_j$ before they hit any boundary. This probability is at most the probability that more than $\log^2 n$
infected agents performing unbounded random walks meet $a_j$, where each infected agent is at least $\hat{\ell}_2$ away
from $a_j$ initially.

By Lemma 2.4, there exists a constant $c_0$ such that for all possible values of $X_{i,j}$ and sufficiently large
$n$:



r> r "v ^ 1 2 1 v v v 1 ^ fJ2i< mi x i,j\ f c log 2 n\ 
Pr[ 2^ Yi tj > log n\X 1>j ,X 2 , j ,...,X mi>j ] < - 

V log n y V £2 J 



log 2 n 



\ / 1 2 \ log 2 " 

mi \ / cq log n x 



' log 2 nj V l 2 

/ \ log 2 n / 1 2 \ log 2 n 

/ emi \ 6 / cq log n N 
Vlog 2 n) \ £2 

(\ log 2 n 
£2 ) 

< exp ( — log 2 n log log n) . 



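Therefore,

$$\Pr\Big[\sum_{a_i \in A^f} Y_{i,j} \ge \log^2 n\Big] = \mathrm{E}\Big[\Pr\Big[\sum_{a_i \in A^f} Y_{i,j} \ge \log^2 n \;\Big|\; X_{1,j}, \ldots, X_{m_1,j}\Big]\Big] \le \exp(-\log^2 n \log\log n).$$

By a union bound we have

$$\Pr\Big[\exists j : \sum_{a_i \in A^f} Y_{i,j} \ge \log^2 n\Big] \le m \cdot \exp(-\log^2 n \log\log n) \le \exp(-2 \log^2 n). \qquad (7)$$

Next, let us fix $a_i \in A^f$; we may argue in a similar way to obtain

$$\Pr\Big[\exists i : \sum_{a_j \in A^u} Y_{i,j} \ge \log^2 n\Big] \le \exp(-2 \log^2 n).$$

Define $e_t$ as the event that $(\forall i, \sum_{a_j \in A^u} Y_{i,j} \le \log^2 n) \wedge (\forall j, \sum_{a_i \in A^f} Y_{i,j} \le \log^2 n)$. Therefore,
$\Pr[e_t] \ge 1 - 2\exp(-2 \log^2 n)$. Observe that $e_t$ implies $\sum_{i,j} Y_{i,j} \le \min\{m_1, m_2\} \log^2 n$. We have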

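$$\begin{aligned}
\tau_1 m_1 m_2/\hat{\ell}_2 &\le \mathrm{E}\Big[\sum_{i,j} Y_{i,j}\Big] \\
&= \mathrm{E}\Big[\sum_{i,j} Y_{i,j} \;\Big|\; e_t\Big] \Pr[e_t] + \mathrm{E}\Big[\sum_{i,j} Y_{i,j} \;\Big|\; \neg e_t\Big] \Pr[\neg e_t] \\
&\le \mathrm{E}\Big[\sum_{i,j} Y_{i,j} \;\Big|\; e_t\Big] + m^2 \Pr[\neg e_t] \\
&\le \mathrm{E}\Big[\sum_{i,j} Y_{i,j} \;\Big|\; e_t\Big] + 2m^2 \exp(-2 \log^2 n).
\end{aligned}$$

Therefore,

$$\begin{aligned}
\mathrm{E}\Big[\sum_{i,j} Y_{i,j} \;\Big|\; e_t\Big] &\ge \tau_1 m_1 m_2/\hat{\ell}_2 - 2m^2 \exp(-2 \log^2 n) \\
&\ge \frac{\tau_1 m_1 m_2}{2\hat{\ell}_2} \\
&= \frac{\tau_1 \max\{m_1, m_2\} \min\{m_1, m_2\}}{2\hat{\ell}_2} \\
&\ge \frac{\tau_1 \min\{m_1, m_2\}}{2 \log^2 n}.
\end{aligned}$$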

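Next, define the indicator variable $I_j = 1$ if and only if $\sum_{a_i \in A^f} Y_{i,j} > 0$. The sum $\sum_j I_j$ is the total
number of newly infected agents in our weaker process and thus is a lower bound on $M(t)$. Note that if $e_t$
holds,

$$\sum_{a_j \in A^u} I_j \le \sum_{i,j} Y_{i,j} \le \min\{m_1, m_2\} \log^2 n,$$

and hence

$$\mathrm{E}\Big[\sum_j I_j \;\Big|\; e_t\Big] \le \min\{m_1, m_2\} \log^2 n.$$

On the other hand,

$$\mathrm{E}\Big[\sum_j I_j \;\Big|\; e_t\Big] \ge \frac{1}{\log^2 n} \mathrm{E}\Big[\sum_{i,j} Y_{i,j} \;\Big|\; e_t\Big] \ge \frac{\tau_1 \min\{m_1, m_2\}}{2 \log^4 n}. \qquad (8)$$

Now define $\tilde{m} = \frac{\tau_1 \min\{m_1, m_2\}}{4 \log^4 n}$. We have

$$\begin{aligned}
2\tilde{m} &\le \mathrm{E}\Big[\sum_j I_j \;\Big|\; e_t\Big] \\
&= \mathrm{E}\Big[\sum_j I_j \;\Big|\; e_t, \sum_j I_j \le \tilde{m}\Big] \Pr\Big[\sum_j I_j \le \tilde{m} \;\Big|\; e_t\Big] + \mathrm{E}\Big[\sum_j I_j \;\Big|\; e_t, \sum_j I_j > \tilde{m}\Big] \Pr\Big[\sum_j I_j > \tilde{m} \;\Big|\; e_t\Big] \\
&\le \mathrm{E}\Big[\sum_j I_j \;\Big|\; e_t, \sum_j I_j \le \tilde{m}\Big] + \min\{m_1, m_2\} \log^2 n \Pr\Big[\sum_j I_j > \tilde{m} \;\Big|\; e_t\Big] \\
&\le \tilde{m} + \min\{m_1, m_2\} \log^2 n \Pr\Big[\sum_j I_j > \tilde{m} \;\Big|\; e_t\Big].
\end{aligned}$$

Therefore,

$$\Pr\Big[\sum_j I_j \ge \frac{\tau_1 \min\{m_1, m_2\}}{4 \log^4 n} \;\Big|\; e_t\Big] \ge \frac{\tau_1}{4 \log^6 n}.$$

Finally,

$$\Pr\Big[\sum_j I_j \ge \frac{\tau_1 \min\{m_1, m_2\}}{4 \log^4 n}\Big] \ge \Pr\Big[\sum_j I_j \ge \frac{\tau_1 \min\{m_1, m_2\}}{4 \log^4 n} \;\Big|\; e_t\Big] \Pr[e_t] \ge \frac{\tau_1}{4 \log^6 n}\big(1 - 2\exp(-\log^2 n)\big) \ge \frac{\tau_1}{5 \log^6 n}.$$

By setting $\tau_0 = \tau_1/5$, we get our result. □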
The next step is to characterize the growth rate at a larger scale. This requires more notation. We denote
the set of $b^3$ subcubes of size $\hat{\ell}_2 \times \hat{\ell}_2 \times \hat{\ell}_2$ as $\mathcal{L} = \{h_{i,j,k} : i, j, k \in [b]\}$. For an arbitrary subcube $h_{i,j,k}$, we
define its neighbors as $N(h_{i,j,k}) = \{h_{i',j',k'} : |i - i'| + |j - j'| + |k - k'| = 1\}$. In other words, $h_{i',j',k'}$ is
a neighbor of $h_{i,j,k}$ if and only if both subcubes share a facet. Let $\mathcal{H}$ be an arbitrary subset of $\mathcal{L}$. We write
$N(\mathcal{H}) = \bigcup_{h \in \mathcal{H}} N(h)$.

Definition 4.3 (Exterior and interior surface). Let $\mathcal{H}$ be a subset of $\mathcal{L}$. The exterior surface of $\mathcal{H}$ is $\partial\mathcal{H} =
N(\mathcal{H}) - \mathcal{H}$. Let $\bar{\mathcal{H}}$ be the complement of $\mathcal{H}$. The interior surface of $\mathcal{H}$ is $\partial\bar{\mathcal{H}} = N(\bar{\mathcal{H}}) - \bar{\mathcal{H}}$, i.e., the exterior
surface of the complement of $\mathcal{H}$.
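Definition 4.3 translates directly into set operations. The following sketch is illustrative: it identifies each subcube with an index triple, uses 0-based indices in `range(b)` rather than $[b]$, and all function names are hypothetical.

```python
from itertools import product

def neighbors(cube, b):
    """Subcubes sharing a facet with `cube` inside the b x b x b grid."""
    i, j, k = cube
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    out = set()
    for di, dj, dk in steps:
        c = (i + di, j + dj, k + dk)
        if all(0 <= x < b for x in c):
            out.add(c)
    return out

def neighborhood(H, b):
    """N(H): union of the neighbor sets of all subcubes in H."""
    return set().union(*(neighbors(h, b) for h in H))

def exterior_surface(H, b):
    """N(H) - H."""
    return neighborhood(H, b) - set(H)

def interior_surface(H, b):
    """Exterior surface of the complement of H."""
    complement = set(product(range(b), repeat=3)) - set(H)
    return neighborhood(complement, b) - complement
```

For a single subcube in the middle of a $3 \times 3 \times 3$ grid, the exterior surface consists of its six facet neighbors and the interior surface is the subcube itself.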

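At time step $t = i\Delta t$, let $\mathcal{G}_t$ be the set of all subcubes that contain more than $\hat{\ell}_2/2$ infected agents and
let $g_t = |\mathcal{G}_t|$; let $\mathcal{B}_t = \bar{\mathcal{G}}_t$ be the rest of the subcubes and let $b_t = |\mathcal{B}_t|$. We call a subcube in $\mathcal{G}_t$ an infected
(good) subcube and a subcube in $\mathcal{B}_t$ an uninfected (bad) subcube.

We classify the agents in the process according to the subcubes they reside in. To facilitate our analysis,
we adopt the notational system $\mathfrak{A}_t$ and $\Delta\mathfrak{A}_t$ to represent the sets of agents that belong to the type
specified in the superscript. Specifically, let $\mathfrak{A}^f_t$ be the set of infected agents at time $t$; decompose the set
$\mathfrak{A}^f_t$ as $\mathfrak{A}^f_t = \mathfrak{A}^{f,\mathcal{G}}_t \cup \mathfrak{A}^{f,\mathcal{B}}_t$, where $\mathfrak{A}^{f,\mathcal{G}}_t$ is the set of infected agents residing in the subcubes in $\mathcal{G}_t$ and $\mathfrak{A}^{f,\mathcal{B}}_t$
the set of infected agents in $\mathcal{B}_t$. Similarly, let $\mathfrak{A}^u_t$ be the set of all uninfected agents; decompose the set
$\mathfrak{A}^u_t$ as $\mathfrak{A}^u_t = \mathfrak{A}^{u,\mathcal{G}}_t \cup \mathfrak{A}^{u,\mathcal{B}}_t$, where $\mathfrak{A}^{u,\mathcal{G}}_t$ is the set of uninfected agents residing in the subcubes in $\mathcal{G}_t$ and
$\mathfrak{A}^{u,\mathcal{B}}_t$ the set of uninfected agents in $\mathcal{B}_t$. Furthermore, we denote by $\Delta\mathfrak{A}^{\mathcal{G}}_t$ and $\Delta\mathfrak{A}^{\mathcal{B}}_t$ the sets of agents in $\mathcal{G}_t$
and $\mathcal{B}_t$ respectively that are infected between $t$ and $t + \Delta t$. Hence the total increase in infected agents, or
equivalently the total decrease in uninfected agents, between $t$ and $t + \Delta t$ is given by $\Delta\mathfrak{A}_t = \Delta\mathfrak{A}^{\mathcal{G}}_t \cup \Delta\mathfrak{A}^{\mathcal{B}}_t$.

Lastly, we let $\Delta\mathfrak{A}^{\partial\mathcal{G}}_t$ be the set of agents in $\mathcal{G}_t \cup \partial\mathcal{G}_t$ that are infected between $t$ and $t + \Delta t$.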

Similar to the lower bound analysis, here we also introduce good density conditions that can be easily
verified to hold with high probability, and reuse the symbols $D_t$ and $D$ with slightly different meanings from
the last section:

Definition 4.4. Let $\{D_t : t \ge 0\}$ be a sequence of binary random variables such that $D_t = 1$ if for all time
steps on or before $t$, the number of agents in any subcube in $V^3$ of size $\hat{\ell}_2 \times \hat{\ell}_2 \times \hat{\ell}_2$ is between $\hat{\ell}_2$ and
$2\hat{\ell}_2 \log^2 n$. Also, let $D = D_{n^{2.5}}$.

The following lemma shows that $D_t = 1$ with high probability; its proof is left to Appendix C.

Lemma 4.5. For any t < n 2 5 , Pr[Z^ = 0] < exp(— ^ log 2 n)for sufficiently large n. 

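We now state two bounds on the growth rate of the agent types, one relative to the "boundary subcubes"
$\partial\mathcal{G}_t$ and one relative to the total agents of each type:

Corollary 4.6. For some constant $\tau_0$,

$$\Pr\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \ge |\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \ge \tau_0 \log^{-6} n.$$

Consequently,

$$\Pr\left[\, |\Delta\mathfrak{A}^B_t| \ge |\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \ge \tau_0 \log^{-6} n
\quad\text{and}\quad
\Pr\left[\, |\Delta\mathfrak{A}_t| \ge |\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \ge \tau_0 \log^{-6} n.$$

Corollary 4.7. We have

$$\Pr\left[\, |\Delta\mathfrak{A}^G_t| \ge \frac{\tau_0 |\mathfrak{A}^{u,G}_t|}{4\log^{38} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \ge \tau_0 \log^{-6} n
\quad\text{and}\quad
\Pr\left[\, |\Delta\mathfrak{A}^B_t| \ge \frac{\tau_0 |\mathfrak{A}^{f,B}_t|}{4\log^{38} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \ge \tau_0 \log^{-6} n.$$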
The proofs of these two corollaries both rely on using coupled diffusion processes that have slower
diffusion rates. These processes only allow infection locally, i.e. within each "pair" of subcubes on the
surface of $\mathcal{G}_t$ in the case of Corollary 4.6, and within each subcube in Corollary 4.7, and hence can be tackled
by Lemma 4.2. The surface $\partial\mathcal{G}_t$ in Corollary 4.6 appears naturally from a matching argument between
neighboring infected and uninfected subcubes. Roughly speaking, the bounds in Corollary 4.6 are tighter
and hence more useful for the cases where infected/uninfected agents are dense in the infected/uninfected
subcubes, while those in Corollary 4.7 are for cases where the agent types are more uniformly distributed.

Proof of Corollary 4.6. Agents in $\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t$ are those that are in $\partial\mathcal{B}_t$ at time $t$ and become infected by
time $t + \Delta t$. We focus on how the uninfected agents in $\partial\mathcal{B}_t$ become infected.

Let us construct a graph $G = (V, E)$, in which the vertex set $V$ consists of the subcubes in $\partial\mathcal{G}_t \cup \partial\mathcal{B}_t$
and the edge set is defined as

$$E = \{\{u, v\} : u \in \partial\mathcal{G}_t,\ v \in \partial\mathcal{B}_t,\ u \in N(v)\}.$$

We may use a greedy algorithm to argue that there is a matching on $G$ between $\partial\mathcal{B}_t$ and $\partial\mathcal{G}_t$ with size at
least $|\partial\mathcal{G}_t|/11$ (see Lemma E.1 for details). Denote the matching as

$$\mathfrak{M} = \{\{h_1, h'_1\}, \{h_2, h'_2\}, \ldots, \{h_k, h'_k\} : h_i \in \partial\mathcal{B}_t,\ h'_i \in \partial\mathcal{G}_t\},$$

where $k \ge |\partial\mathcal{G}_t|/11$. We next define a coupling process with a slower diffusion rule: an infected agent can
transmit the virus to an uninfected one if and only if at time $t$ the infected agent is in $h'_j$ and the uninfected one
is in $h_j$ for some $j$. Let $\rho_j$ be the number of uninfected agents initially in $h_j$ at time $t$ that become infected
by time $t + \Delta t$ under the slower diffusion rule. Then $\sum_{j \le k} \rho_j$ is at most $|\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t|$ in the original
process. We design the coupling in this way because the $\rho_j$ are independent of each other, as they are decided
by independent walks from disjoint pairs of subcubes.
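The greedy procedure behind the matching can be sketched as follows. This is a generic greedy maximal matching on a bipartite edge list, written as an illustration; the function name and the edge representation are assumptions of this sketch, and the $1/11$ size guarantee comes from the paper's lemma on bounded neighborhoods, not from this code.

```python
def greedy_matching(edges):
    """Scan edges (u, v) in order; keep an edge when both endpoints are still free."""
    used_u, used_v, matching = set(), set(), []
    for u, v in edges:
        if u not in used_u and v not in used_v:
            matching.append((u, v))
            used_u.add(u)
            used_v.add(v)
    return matching
```

Because each subcube has a bounded number of neighbors, every kept edge can block only a bounded number of other edges, which is why a greedy matching covers a constant fraction of the boundary subcubes.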

Next we apply Lemma 4.2 on each pair of the matching. Fix an arbitrary matched pair $\{h_j, h'_j\}$. Since
$h_j \in \partial\mathcal{B}_t$, at time $t$ there are at least $\ell_2/2$ uninfected agents in $h_j$; similarly, since $h'_j \in \partial\mathcal{G}_t$, there are at
least $\ell_2/2$ infected agents in $h'_j$. At time $t$ we can find a subset of uninfected agents $A^u$ in $h_j$ and a subset of
infected agents $A^f$ in $h'_j$ such that $|A^u| = |A^f| = \ell_2/\log^2 n$. Therefore, by Lemma 4.2 we have

$$\Pr\left[\rho_j \ge \frac{\tau_0 \ell_2}{\log^4 n} \;\middle|\; \mathcal{F}_t, D_t = 1\right] \ge \tau_0 \log^{-6} n \qquad (9)$$

for some constant $\tau_0$. From Equation 9 we can see that $E[\rho_j \mid \mathcal{F}_t, D_t = 1] \ge \tau_0^2 \ell_2 \log^{-10} n$. Therefore,

$$E\Big[\sum_{i \le k} \rho_i \;\Big|\; \mathcal{F}_t, D_t = 1\Big] \ge \frac{\tau_0^2}{11} |\partial\mathcal{G}_t|\, \ell_2 \log^{-10} n. \qquad (10)$$

Next, we consider two cases.
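Case 1. $|\partial\mathcal{G}_t| < \log^9 n$. In this case

$$\begin{aligned}
&\Pr\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \ge |\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \\
&\quad\ge \Pr\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \ge \log^9 n \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \\
&\quad= \Pr\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \ge \frac{\tau_0 \ell_2}{4\log^{4} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \\
&\quad\ge \Pr\left[\rho_1 \ge \tau_0 \ell_2 \log^{-4} n \;\middle|\; \mathcal{F}_t, D_t = 1\right] \quad \text{(only focus on an arbitrary matched pair in the matching)} \\
&\quad\ge \tau_0 \log^{-6} n. \quad \text{(Lemma 4.2)}
\end{aligned}$$

Case 2. $|\partial\mathcal{G}_t| \ge \log^9 n$. Notice that $\rho_1, \ldots, \rho_k$ are independent by construction. Also, we have

$$|\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \le \frac{\tau_0^2}{22} |\partial\mathcal{G}_t|\, \ell_2 \log^{-10} n \le \frac{1}{2} E\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \;\middle|\; \mathcal{F}_t, D_t = 1 \right].$$

By a Chernoff bound (the version we use is Theorem A.1 with $\delta = 1/2$), we have

$$\begin{aligned}
&\Pr\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \le |\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right] \\
&\quad\le 2\exp\left( -\frac{(1/2)^2\, E\left[\, |\Delta\mathfrak{A}^B_t \cap \Delta\mathfrak{A}^{\partial}_t| \mid \mathcal{F}_t, D_t = 1 \right]}{3} \right) \\
&\quad\le 2\exp\left( -\frac{\tau_0^2}{132} |\partial\mathcal{G}_t|\, \ell_2 \log^{-10} n \right) \quad \text{(using Equation 10)} \\
&\quad\le 1 - \tau_0 \log^{-6} n
\end{aligned}$$

for sufficiently large $n$. Our corollary thus follows. □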

Proof of Corollary 4.7. Let us start with proving the first inequality. We first couple the diffusion problem
with a slower diffusion process defined as follows. First, all the infected agents in $\mathcal{B}_t$ at time $t$ cannot
transmit the virus. Second, agents in $\mathcal{G}_t$ are able to transmit the virus to each other if and only if at time $t$
they are in the same $g$ for some $g \in \mathcal{G}_t$. For an arbitrary $g \in \mathcal{G}_t$, we let $\mathfrak{A}^{u,G}_{t,g}$ be the set of uninfected agents
in $g$ at time $t$. Accordingly, let $\Delta\mathfrak{A}^{G}_{t,g}$ be the set of agents in $\mathfrak{A}^{u,G}_{t,g}$ that become infected by $t + \Delta t$ under the
slower coupled process. By Lemma 4.2 we have



Pr 



I AO? 



■3 



> 



r min{4/(log 2 n),|2l"f|} 



4 log n 



> TQ log" 



n. 



(11) 



Note that Lemma 4.2 requires that both the number of infected agents and the number of uninfected agents
are at most $\ell_2/\log^2 n$. By $\mathcal{G}_t$'s construction, there are at least $\ell_2/2$ infected agents in each subcube, and
we may choose an arbitrary subset of them with size $\ell_2/\log^2 n$ to form $A^f$ in Lemma 4.2. We do not know
the exact size of $|\mathfrak{A}^{u,G}_{t,g}|$, but in case $|\mathfrak{A}^{u,G}_{t,g}| > \ell_2/\log^2 n$, we let $A^u$ be an arbitrary subset of $\mathfrak{A}^{u,G}_{t,g}$ with size
$\ell_2/\log^2 n$.



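When $D_t = 1$, we have $|\mathfrak{A}^{u,G}_{t,g}| \le 2\ell_2 \log^2 n$. Therefore,

$$\min\{\ell_2/\log^2 n,\ |\mathfrak{A}^{u,G}_{t,g}|\} \ge \min\left\{\frac{|\mathfrak{A}^{u,G}_{t,g}|}{2\log^4 n},\ |\mathfrak{A}^{u,G}_{t,g}|\right\} \ge \frac{|\mathfrak{A}^{u,G}_{t,g}|}{2\log^4 n}.$$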



Equation [TT]can be rewritten as 



Pr 



u,Q I 



TuD t = 1 



81og s n 

This also provides a lower bound over the expectation of | A2lf |, i.e., 

E[|Aa^||^,A = i]>^|sagf|iog 



> tq log n. 



11 



n. 



We also have 



E[|A2lf| I T U A = 1] =5>[|A2lg s | I T U A = 1] > £ log^ 14 n = ^|2t^| log" 14 n. 



(12) 

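Furthermore, by the way we design the coupled process, the random variables $|\Delta\mathfrak{A}^{G}_{t,g}|$ are independent given
$\mathcal{F}_t, D_t = 1$.

We consider two cases.

Case 1. There exists $g \in \mathcal{G}_t$ such that $|\mathfrak{A}^{u,G}_{t,g}| \ge |\mathfrak{A}^{u,G}_t|/\log^{29} n$. In this case, we have

$$\Pr\left[\, |\Delta\mathfrak{A}^G_t| \ge \frac{\tau_0 |\mathfrak{A}^{u,G}_t|}{4\log^{38} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right]
\ge \Pr\left[\, |\Delta\mathfrak{A}^G_{t,g}| \ge \frac{\tau_0 |\mathfrak{A}^{u,G}_{t,g}|}{4\log^{9} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right]
\ge \tau_0 \log^{-6} n.$$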



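Case 2. For all $g \in \mathcal{G}_t$, $|\mathfrak{A}^{u,G}_{t,g}| < |\mathfrak{A}^{u,G}_t|/\log^{29} n$. Observe, on the other hand, that $\sum_g |\mathfrak{A}^{u,G}_{t,g}| = |\mathfrak{A}^{u,G}_t|$.
In this case, the summation $\sum_g |\mathfrak{A}^{u,G}_{t,g}|^2$ is maximized when every non-zero $|\mathfrak{A}^{u,G}_{t,g}|$ is exactly
$|\mathfrak{A}^{u,G}_t| \log^{-29} n$. We therefore have

$$\sum_g |\mathfrak{A}^{u,G}_{t,g}|^2 \le |\mathfrak{A}^{u,G}_t| \log^{-29} n \sum_g |\mathfrak{A}^{u,G}_{t,g}| = |\mathfrak{A}^{u,G}_t|^2 \log^{-29} n. \qquad (13)$$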






Next, by Hoeffding's inequality (See, e.g., Theorem lA.3l >. we have 



Pr 



< Pr 



I AO? I < 



|A2lf | < 



T o\&t I 
41og 38 n 



•Ft, A = 1 



E[|A2lf | \F t ,D T = l] 



F t ,D t 



Pr 



E[|A2lf| \F t ,D T = l] 



F u D t = 1 



< 2 exp 



2(±E[|A2lf| | F u D t = l]f 



V I9l"' e l 2 



(apply Hoeffding's inequality; we have |A2lf 5 | < |2l"^|) 



< 2 exp 



2 1 T J ! |gi] x ' g|2l —~ 2N 



'log' 
a^piog- 29 

exp(— 0(logra)) 



n 



(by Equation [12] and Equation [T3l) 



n 



< 1 - T log" 



n, 



for sufficiently large n. 

Proving the inequality regarding $|\Delta\mathfrak{A}^B_t|$ is similar. We provide it here for completeness, but less patient
readers may simply skip this part. We first couple the diffusion problem with a slower process. First, all
the infected agents in $\mathcal{G}_t$ at time $t$ cannot transmit the virus. Second, agents in $\mathcal{B}_t$ are able to transmit the virus to
each other if and only if at time $t$ they are in the same $b$ for some $b \in \mathcal{B}_t$. For an arbitrary $b \in \mathcal{B}_t$, we
let 2l;f' fc 6 be the set of uninfected agents in b at time t. Accordingly, let A2lf b be the set of agents in 21^ 
that becomes infected at t + At under the coupled process. For technical reasons, we require the slower 
diffusion in the subcube b to halt when | A2lf b | becomes large, i.e., | A2tf b 

|2l^' fe | allows us to apply Hoeffding's inequality in an easier manner. 



f 3 

W t ' b | . This added constraint 



When A 
we have 



1, 2l(f < 24 log 2 n and min{|2tff ^(log^n)} > |2lf',°|/(21og 4 n). By Lemma 



r/,6 



f,B\ 



l f.6 



Pr 



Similar to the analysis of 21"^, this inequality holds because we can always restrict to a subset of agents if 
the number of infected/uninfected agents in the subcube is too large to meet the requirement in Lemma |4~2] 
We also have 



|A2t 



t,b\ 



> 



! log n 



F t ,D t = l 



> T log" 



n. 



T t ,D t = 1} = 5>[|ASl» | | F u D t = 1] > ^|2lf< B |log- 14 n. 



(14) 



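Furthermore, by the way we design the coupled process, the random variables $|\Delta\mathfrak{A}^{B}_{t,b}|$ are independent given
$\mathcal{F}_t, D_t = 1$.

We consider two cases.

Case 1. There exists a $b \in \mathcal{B}_t$ such that $|\mathfrak{A}^{f,B}_{t,b}| \ge |\mathfrak{A}^{f,B}_t|/\log^{29} n$. In this case, we have

$$\Pr\left[\, |\Delta\mathfrak{A}^B_t| \ge \frac{\tau_0 |\mathfrak{A}^{f,B}_t|}{4\log^{38} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right]
\ge \Pr\left[\, |\Delta\mathfrak{A}^B_{t,b}| \ge \frac{\tau_0 |\mathfrak{A}^{f,B}_{t,b}|}{4\log^{9} n} \;\middle|\; \mathcal{F}_t, D_t = 1 \right]
\ge \tau_0 \log^{-6} n.$$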
Case 2. For all $b \in \mathcal{B}_t$, $|\mathfrak{A}^{f,B}_{t,b}| < |\mathfrak{A}^{f,B}_t|/\log^{29} n$. In this case, we have

$$\sum_{b \in \mathcal{B}_t} |\mathfrak{A}^{f,B}_{t,b}|^2 \le |\mathfrak{A}^{f,B}_t|^2 \log^{-29} n. \qquad (15)$$



Next, by Hoeffding's inequality (again by Theorem lA.3l >. 
we have 



Pr 



< Pr 



f,B\ 



|A2lf | < 



|A2lf | < 



F t ,A = i 



4 log 38 n 
E[|A2lf| |Ft,Ar = l] 



-Ft, A = l 



Pr 



t,b\ 



< 



E[|A5lf| | Ji,D T = l] 



Ft, A = 1 



< 2 exp 



< 2 exp 



2(|E[|A2tf| | T t ,D t = l]f 



(apply Hoeffding's inequality; we have | A2lf 9 | < |2l/',f | by construction 



24l# B l 2 log- 28 



'162 \**t 



n 



2lP| 2 l0£ 



-29 



(by Equation [141 and Equation [TSl 



n 



< 1 — tq log n. 



□ 



4.1 Leveraging local analysis 

We now move to the global diffusion upper bound. As discussed in the beginning of this section, the
balance between the distributions of each type of subcube and the distributions of actual agents plays a
crucial role in our analysis. Fixing an arbitrary time $t$, we classify the joint configurations of the agents into
four types:

• type 1 (namely $\mathcal{P}_{1,t}$): when $|\mathcal{G}_t| \le \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{f,G}_t| \ge \frac{1}{2}|\mathfrak{A}^f_t|$.
• type 2 (namely $\mathcal{P}_{2,t}$): when $|\mathcal{G}_t| \le \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{f,G}_t| < \frac{1}{2}|\mathfrak{A}^f_t|$.
• type 3 (namely $\mathcal{P}_{3,t}$): when $|\mathcal{G}_t| > \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{u,G}_t| < \frac{1}{2}|\mathfrak{A}^u_t|$.
• type 4 (namely $\mathcal{P}_{4,t}$): when $|\mathcal{G}_t| > \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{u,G}_t| \ge \frac{1}{2}|\mathfrak{A}^u_t|$.

Recall that $\mathcal{F}_t$ refers to the information on the global configurations up to time $t$. We shall abuse notation
slightly and say $\mathcal{F}_t \in \mathcal{P}_{i,t}$ if the configuration of the agents at time $t$ belongs to the $i$th type described
above. Notice that $\mathcal{F}_t$ belongs to exactly one of the sets $\mathcal{P}_{1,t}, \mathcal{P}_{2,t}, \mathcal{P}_{3,t}, \mathcal{P}_{4,t}$. In brief, scenarios $\mathcal{P}_{1,t}$ and
$\mathcal{P}_{2,t}$ have a majority of uninfected subcubes, while $\mathcal{P}_{3,t}$ and $\mathcal{P}_{4,t}$ have a majority of infected subcubes. From
another perspective, $\mathcal{P}_{1,t}$ and $\mathcal{P}_{3,t}$ refer to situations when the dominant types (with respect to the status of
infection) are dense in their subcube types (infected/uninfected subcubes), while $\mathcal{P}_{2,t}$ and $\mathcal{P}_{4,t}$ refer to the
more uniform scenarios. The next lemma states that when $\mathcal{F}_t \in \mathcal{P}_{1,t} \cup \mathcal{P}_{2,t}$, the total number of infected
agents $|\mathfrak{A}^f_t|$ grows in proportion to a monotone function of $|\mathfrak{A}^f_t|$ within $\Delta t$ steps. On the other hand, when
$\mathcal{F}_t \in \mathcal{P}_{3,t} \cup \mathcal{P}_{4,t}$, the total number of uninfected agents $|\mathfrak{A}^u_t|$ is reduced in proportion to a monotone function
of $|\mathfrak{A}^u_t|$ within $\Delta t$ steps.
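The four-way case split can be written as a small decision function. The sketch below is only an illustration of the classification: the function name and argument names are hypothetical, and the counts are assumed to be supplied by the caller (number of good subcubes, total number of subcubes, and the infected and uninfected agent counts inside good subcubes and overall).

```python
def classify(num_good, num_subcubes, inf_good, inf_total, uninf_good, uninf_total):
    """Return the configuration type (1..4) from subcube and agent counts."""
    if 2 * num_good <= num_subcubes:
        # At most half of the subcubes are good (infected).
        return 1 if 2 * inf_good >= inf_total else 2
    # More than half of the subcubes are good.
    return 3 if 2 * uninf_good < uninf_total else 4
```

Exactly one branch fires for any input, matching the statement that each configuration belongs to exactly one type.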
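Lemma 4.8. Fix an arbitrary $t$ and define the following events:

$$\begin{aligned}
e_1(t) &= \left\{ |\Delta\mathfrak{A}_t| \ge 0.09\,\tau_0 \left( \frac{|\mathfrak{A}^f_t|}{4\ell_2 \log^2 n} \right)^{2/3} \frac{\ell_2}{\log^{13} n} \right\}, &
e_2(t) &= \left\{ |\Delta\mathfrak{A}_t| \ge \frac{\tau_0}{8\log^{38} n} |\mathfrak{A}^f_t| \right\}, \\
e_3(t) &= \left\{ |\Delta\mathfrak{A}_t| \ge 0.015\,\tau_0 \left( \frac{|\mathfrak{A}^u_t|}{4\ell_2 \log^2 n} \right)^{2/3} \frac{\ell_2}{\log^{13} n} \right\}, &
e_4(t) &= \left\{ |\Delta\mathfrak{A}_t| \ge \frac{\tau_0}{8\log^{38} n} |\mathfrak{A}^u_t| \right\}.
\end{aligned}$$

Then

$$\Pr[e_i(t) \mid \mathcal{F}_t \in \mathcal{P}_{i,t}, D_t = 1] \ge \tau_0 \log^{-6} n$$

for $i = 1, 2, 3, 4$.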

Intuitively, $e_1$ and $e_2$ connect the number of newly infected agents to the original number of infected
agents. When $e_1$ or $e_2$ are triggered sufficiently many times, the number of infected agents doubles. Meanwhile,
$e_3$ and $e_4$ connect the number of newly infected agents to the original number of uninfected agents.
When $e_3$ or $e_4$ are triggered sufficiently many times, the number of uninfected agents halves.

The key to proving Lemma 4.8, which will ultimately lead to a bound on the global growth rate of
doubling/halving the total number of infected/uninfected agents as depicted in the next proposition, is a
geometric relation between the boundary of $\mathcal{G}_t$, i.e. $\partial\mathcal{G}_t$, and $\mathcal{G}_t$ itself. More specifically, an isoperimetric
bound on $\mathcal{G}_t$ guarantees that no matter how packed together these good subcubes are, there are still on the order of
$|\mathcal{G}_t|^{2/3}$ of them exposed to the bad subcubes, hence the global infection rate cannot be too slow.
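As a rough numerical illustration of the $|\mathcal{G}_t|^{2/3}$ scaling, the sketch below counts the cells of a set that have a face-neighbor outside the set but inside the grid. This notion of boundary is an assumption made for the example; the paper's exact definition of $\partial\mathcal{G}_t$ and of the neighborhood $N(\cdot)$ is given elsewhere and may differ.

```python
from itertools import product

def inner_boundary(cells, grid_side):
    """Cells of the set with a face-neighbor that is in the grid but not in the set."""
    cells = set(cells)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    boundary = set()
    for (x, y, z) in cells:
        for dx, dy, dz in steps:
            nb = (x + dx, y + dy, z + dz)
            inside = all(0 <= c < grid_side for c in nb)
            if inside and nb not in cells:
                boundary.add((x, y, z))
                break
    return boundary

# A 3 x 3 x 3 block packed into a corner of an 8 x 8 x 8 grid of subcubes.
block = list(product(range(3), repeat=3))
exposed = inner_boundary(block, 8)
```

Even in this tightly packed corner placement, the number of exposed cells stays above a constant multiple of the two-thirds power of the block size.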



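Proof of Lemma 4.8. Part 1. $\mathcal{F}_t \in \mathcal{P}_{1,t}$: $|\mathcal{G}_t| \le \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{f,G}_t| \ge \frac{1}{2}|\mathfrak{A}^f_t|$. Since $D_t = 1$, the number of
agents in each subcube is at most $2\ell_2 \log^2 n$. Therefore, $|\mathcal{G}_t| \ge |\mathfrak{A}^{f,G}_t|/(2\ell_2 \log^2 n)$. To apply Corollary 4.6,
we need to derive a relationship between the size of $\mathcal{G}_t$ and the size of $\partial\mathcal{G}_t$. This is an isoperimetric problem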
studied by [U (see Appendix [D] for details). By Theorem 8 in H or Theorem ID. 2 1 in the appendices, 



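$$|\partial\mathcal{G}_t| \ge 0.36\,|\mathcal{G}_t|^{2/3} \ge 0.36 \left( \frac{|\mathfrak{A}^{f,G}_t|}{2\ell_2 \log^2 n} \right)^{2/3}.$$

We have

$$\begin{aligned}
&\Pr\left[\, |\Delta\mathfrak{A}_t| \ge 0.09\,\tau_0 \left( \frac{|\mathfrak{A}^f_t|}{4\ell_2 \log^2 n} \right)^{2/3} \frac{\ell_2}{\log^{13} n} \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{1,t}, D_t = 1 \right] \\
&\quad\ge \Pr\left[\, |\Delta\mathfrak{A}_t| \ge 0.09\,\tau_0 \left( \frac{|\mathfrak{A}^{f,G}_t|}{2\ell_2 \log^2 n} \right)^{2/3} \frac{\ell_2}{\log^{13} n} \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{1,t}, D_t = 1 \right] \\
&\quad\ge \Pr\left[\, |\Delta\mathfrak{A}_t| \ge |\partial\mathcal{G}_t| \cdot \frac{\tau_0 \ell_2}{4\log^{13} n} \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{1,t}, D_t = 1 \right] \\
&\quad\ge \tau_0 \log^{-6} n \quad \text{(by Corollary 4.6)}.
\end{aligned}$$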
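Part 2. $\mathcal{F}_t \in \mathcal{P}_{2,t}$: $|\mathcal{G}_t| \le \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{f,G}_t| < \frac{1}{2}|\mathfrak{A}^f_t|$. Notice that $|\mathfrak{A}^{f,B}_t| \ge |\mathfrak{A}^f_t|/2$ and $|\Delta\mathfrak{A}^B_t| \le |\Delta\mathfrak{A}_t|$.
We have

$$\begin{aligned}
\Pr\left[\, |\Delta\mathfrak{A}_t| \ge \frac{\tau_0}{8\log^{38} n} |\mathfrak{A}^f_t| \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{2,t}, D_t = 1 \right]
&\ge \Pr\left[\, |\Delta\mathfrak{A}^B_t| \ge \frac{\tau_0}{4\log^{38} n} |\mathfrak{A}^{f,B}_t| \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{2,t}, D_t = 1 \right] \\
&\ge \tau_0 \log^{-6} n \quad \text{(by Corollary 4.7)}.
\end{aligned}$$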
Part 3. $\mathcal{F}_t \in \mathcal{P}_{3,t}$: $|\mathcal{G}_t| > \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{u,G}_t| < \frac{1}{2}|\mathfrak{A}^u_t|$. This is similar to Part 1. We have



\dg t \ = \dB t \ > \dB t \/6 > 0.06|£ t | 2/3 > 0.06 



u,B\ 



2/3 



2£ 2 log 2 n 



The second inequality holds because the exterior surface dB t is in the neighborhood of dB t and \N(dB t 
Q\dB t \. Notice that |2l"' B | > |2^|/2. We have 



< 



Pr 



> Pr 



> Pr 



|ASSt| > 0.015r 



|A2lf | > 0.015r 



|A2lf I > \dG t 




' 4 log 13 n 
> tq log -0 n (by Corollary | 

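Part 4. $\mathcal{F}_t \in \mathcal{P}_{4,t}$: $|\mathcal{G}_t| > \frac{1}{2}((2n+1)/\ell_2)^3$ and $|\mathfrak{A}^{u,G}_t| \ge \frac{1}{2}|\mathfrak{A}^u_t|$. This is similar to Part 2. Notice that $|\mathfrak{A}^u_t| \le 2|\mathfrak{A}^{u,G}_t|$
and $|\Delta\mathfrak{A}_t| \ge |\Delta\mathfrak{A}^G_t|$. We have

$$\begin{aligned}
\Pr\left[\, |\Delta\mathfrak{A}_t| \ge \frac{\tau_0}{8\log^{38} n} |\mathfrak{A}^u_t| \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{4,t}, D_t = 1 \right]
&\ge \Pr\left[\, |\Delta\mathfrak{A}^G_t| \ge \frac{\tau_0}{4\log^{38} n} |\mathfrak{A}^{u,G}_t| \;\middle|\; \mathcal{F}_t \in \mathcal{P}_{4,t}, D_t = 1 \right] \\
&\ge \tau_0 \log^{-6} n \quad \text{(by Corollary 4.7)}.
\end{aligned}$$

□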



Our major proposition presented next essentially pins down the number of times these events need to 
be triggered to double the number of infected agents or halve the number of uninfected ones. 

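Proposition 4.9. Consider the information diffusion problem over $V^3$ with $m$ agents. For any fixed
$t \le n^{2.5} - 4\sqrt{m/n}\,\log^{45} n\,\Delta t$, define the following events:

$$\chi_1(t) = \left\{ \left|\mathfrak{A}^f_{t + 4\sqrt{m/n}\,\log^{45} n\,\Delta t}\right| \ge 2\,|\mathfrak{A}^f_t| \right\}
\quad\text{and}\quad
\chi_2(t) = \left\{ \left|\mathfrak{A}^u_{t + 4\sqrt{m/n}\,\log^{45} n\,\Delta t}\right| \le \frac{1}{2}\,|\mathfrak{A}^u_t| \right\}.$$

We have

$$\Pr[\chi_1(t) \vee \chi_2(t)] \ge 1 - \exp(-\log^2 n).$$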



Note that this bound suggests that for each time increment $4\sqrt{m/n}\,\log^{45} n\,\Delta t$, either the number of
infected agents doubles or the number of uninfected agents is reduced by half with high probability. Therefore,
within time at most $2\log n \cdot (4\sqrt{m/n}\,\log^{45} n\,\Delta t) = 128\, n\, \ell_2 \log^{47} n$ all the agents get infected with probability
at least $1 - 2\log n \exp(-\log^2 n)$. This proves Theorem 4.1.
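The counting step behind "logarithmically many increments suffice" can be checked with a small sketch. In each phase either the infected count doubles or the uninfected count halves; the function below, whose name and callback interface are hypothetical, counts the phases until everyone is infected. It uses base-2 logarithms and integer rounding, which is an assumption of the example and not the paper's exact accounting.

```python
import math

def phases_until_all_infected(m, choose_double):
    """Count phases; choose_double(k) decides whether phase k doubles or halves."""
    infected, phases = 1, 0
    while infected < m:
        uninfected = m - infected
        if choose_double(phases):
            infected = min(m, 2 * infected)   # infected agents double
        else:
            infected = m - uninfected // 2    # uninfected agents halve
        phases += 1
    return phases

worst = phases_until_all_infected(1000, lambda k: k % 2 == 0)
```

At most $\lceil \log_2 m \rceil$ doubling phases and at most $\lceil \log_2 m \rceil$ halving phases can occur before all agents are infected, so any interleaving finishes within $2\lceil \log_2 m \rceil$ phases.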

To summarize our approach, Corollaries 4.6 and 4.7 first translate the local infection rate of Lemma 4.2
into a rate based on the subcube types (i.e. good and bad subcubes). Then Lemma 4.8 further aggregates the
growth rate to depend only on the infected and uninfected agents, by looking at the geometrical arrangement
of the subcubes. The bound from Lemma 4.8 is still too crude on its own, but by making a long enough
sequence of trials, i.e. $4\sqrt{m/n}\,\log^{45} n$ of them, at least one of the four scenarios defined in Lemma 4.8 occurs
a significant number of times, despite the $\Omega(\log^{-6} n)$ probability of occurrence at each individual step
for any of the four scenarios. This leads to the probabilistic bound for $\chi_1(t) \vee \chi_2(t)$.
Proof of Proposition 4.9. First, notice that Lemma 4.8 states that regardless of the diffusion process' history,
one of $e_1$, $e_2$, $e_3$, and $e_4$ is guaranteed to occur with $\tilde{\Omega}(1)$ probability. On the other hand, it is not difficult to
see that when any of the events $e_1, \ldots, e_4$ occurs $\tilde{\Omega}(n/\ell_2)$ times, then either $|\mathfrak{A}^f_t|$ doubles or $|\mathfrak{A}^u_t|$ reduces
by a half. Although we do not know exactly which event happens at a specific time $t$, we argue that so
long as we wait long enough, the collection of events $\{e_1, \ldots, e_4\}$ occurs $4\tilde{\Omega}(n/\ell_2)$ times and, by the pigeonhole
principle, at least one of $e_1, \ldots, e_4$ will be triggered $\tilde{\Omega}(n/\ell_2)$ times, concluding the proposition. The
above argument can be made rigorous via a Chernoff bound on the total number of occurrences of all four
events.

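We now implement the idea. Let us define $q = 4\sqrt{m/n}\,\log^{45} n$. Let $t_i = t + (i - 1)\Delta t$ for $i \in [q]$. Note
that $t_i$ depends on both $i$ and $t$, but we suppress the dependence on $t$ for succinctness. This also applies to all
other defined quantities in this proof. Recall from Lemma 3.13 that $\mathcal{F}_{t_i}$ encodes all the available information
up to time $t_i$. For each $i \in [q]$, define the following pair of indicator functions:

$$I_{1,2}(t_i) = \begin{cases} 1 & \text{if } \mathcal{F}_{t_i} \in \mathcal{P}_{1,t_i} \cup \mathcal{P}_{2,t_i} \\ 0 & \text{otherwise} \end{cases}
\quad\text{and}\quad
I_{3,4}(t_i) = \begin{cases} 1 & \text{if } \mathcal{F}_{t_i} \in \mathcal{P}_{3,t_i} \cup \mathcal{P}_{4,t_i} \\ 0 & \text{otherwise.} \end{cases}$$

Notice that for arbitrary $t_i$, $I_{1,2}(t_i) + I_{3,4}(t_i) = 1$. Next define

$$\varphi_i = I_{1,2}(t_i) \cdot \frac{|\Delta\mathfrak{A}_{t_i}|}{|\mathfrak{A}^f_{t_i}|} + I_{3,4}(t_i) \cdot \frac{|\Delta\mathfrak{A}_{t_i}|}{|\mathfrak{A}^u_{t_i}|}.$$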



We first show a lower bound for $\sum_{i \le q} \varphi_i$. Our strategy is to invoke Lemma 4.8 and apply a Chernoff
bound. Special care needs to be taken when $D = 0$.
Let



r = mm < 



0.015r 



1 



4m log 2 n I log 16 n ' 8 log 38 n 
By Lemma 4.8 we can see that, regardless of whether $\mathcal{F}_{t_i}$ belongs to $\mathcal{P}_{1,t_i}$, $\mathcal{P}_{2,t_i}$, $\mathcal{P}_{3,t_i}$, or $\mathcal{P}_{4,t_i}$,

$$\Pr[\varphi_i \ge r \mid \mathcal{F}_{t_i}, D_{t_i} = 1] \ge \tau_0 \log^{-6} n, \qquad (16)$$

where $\tau_0$ is the constant specified in Lemma 4.8. Here, we verify the case $\mathcal{F}_{t_i} \in \mathcal{P}_{1,t_i}$ for Equation 16. The
computation for the other three cases can be carried out similarly.

Pv[^>r\T ti eVi, ti ,V ti = l] 



> Pr 



> Pr 



> Pr 



> Pr 



> Pr 



|A2l f | 0.015 

> — ~ 7"0 



|2l/| log 16 ra I 4m log 



n 



T u GV hu ,D u =l 



log 16 n I 4m log 2 



T ti eP 1>ti ,D ti = l 



|AOt| > - jg tq 



2/3 



|AStt| > — ^r 



log 1D n \4£ 2 log 2 n / 
0.09 / ISli 

= — Tn — ; 



|2lf| V3 (4i 2 log 2 n) 2 / 3 



<m 



4m log n 



2/3 



-m 1 / 3 (41og 2 n) 2 / 3 



|A2l t | > 0.09r 



log 3 n \ U 2 log 2 n ) log 13 n 

\ 2/3 



1 \ 5 



4m log n 



4£ 2 log 2 n / log 13 n 



> T log 6 n (Lemma |4~8l) 

Next, let us define a family of indicator random variables : i < ?} so that is J 7 ^ -measurable 



and 



I(i) 



1 if fi > r 
otherwise. 



Notice that ^ i<<; > r ( ^ i<? 7(j) ) - By Equation [T6l we have 



Pr[J(i) = 1 | Ji,, A* = 1] > r log" 6 n. 



Next, let us introduce another family of r.v. {!'{£) ■ i < ?} to incorporate the good density variable as 
follows: 

7(i) ifD ti = l 

1 otherwise 



Since > 7(i), we also have 



On the other hand, by construction 



This concludes that 



which implies 



E[l'(i) \F ti ,D u = 1] >r log- 6 n. 

E[7'(i)| J- ti ,A l =0] = l>log- 7 n 
E[7'(z) | JiJ >log- 7 n 
E[^7'(i)]>?log- 



re. 
We construct a sequence of random variables $\{\xi_i\}$ such that $\xi_0 = 0$ and $\xi_{i+1} = \xi_i + (I'(i+1) -
E[I'(i+1) \mid \mathcal{F}_{t_{i+1}}])$. We can verify that $\xi_i$ is a martingale with respect to $\{\mathcal{F}_{t_i}\}$ and $|\xi_i - \xi_{i-1}| \le 2$ for all $i$.
By the Azuma-Hoeffding inequality (see Theorem A.4),



Pr 



|^|>^EE/'(i)] 



< 2exp 



1 



< 2exp(— -^—q\og 14 n) < exp(— log 30 n). 



This implies 



PrE A*) < 7^0 log" 6 n] < exp(-log 30 



n . 



Next, notice that 

PrE^W + E J W1 ^ Pr[A+( f -i)At = 0] < Pr[D = 0] < exp(-^log 2 n) 



where the last inequality follows from Lemma 1431 
We conclude that 



Pr[\^I(i) < prolog 6 n] < exp(-log 30 n) + exp(- — log 2 n) < 2exp(-— log 
^-^ 4 15 15 



n ). 



Finally, we show when ($~^< ? i"(£) > |?rolog _6 nj occurs, either xi(*) or X2{t) is true. First, we 

r • ■ 

guarantee on | A2l t | . Specifically, we have 



have a lower bound <p; > r ■ , > 2. Next, we show this lower bound results in a minimum 

^•y rl — 41og b n — ' 



Kg 



which implies either 



V/i^^)^^ > UCase 1) 



or 



E 7 3,4(^^T^ 1(CaSe2) 



Case 1. Observe that 



1 21 



/ _ 

t+4 A /^log 45 At 



o/| = X>a*,| 






because A2l/. are all disjoint for different i. We have 

I21 7 ,— ^ -2l/ 1 > V|A2t t .| 
I t+4^/flog 45 At tl - Z^ 1 * i1 



> ^/i, 2 (ti)|A2l t 



i<<; I4{ I 

- I 21 ^ ( ^ J i,2( f i) /*' ) O^fl is non decreasing w.r.t. i) 




> mi 



In this case, the event xi (t) occurs. 

Case 2. When 21^ = 0, nothing needs to be proved. Let us focus on the situation where 21^ 7^ 



U 

'AO*. 



%<<; 1 



^ \K I ( S J 3,4(^)^r ) <M is non increasing w.r.t. i) 



□ 



Therefore, the event X2{t) occurs in this case. 

5 The case when the number of agents is sparse 

This section focuses on the case where m = o(n): 

Proposition 5.1. Let $a_1, a_2, \ldots, a_m$ be placed uniformly at random on $V^3$, where $m < n \log^{-2} n$. Let $a_1$ be
the agent that holds a virus at $t = 0$, and $T$ be the diffusion time. We have for any constant $c > 0$,

$$\Pr\left[T < \frac{n^3}{m} \log^{-c} n\right] \le \log^{-c} n$$

and

$$\Pr\left[T > \frac{2n^3}{m} \log^{15} n\right] \le \exp(-(\log^2 n)/2).$$
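For intuition about the diffusion time $T$, the process can be simulated directly on a small grid. The sketch below is a toy illustration under stated assumptions: agents transmit when their $L_1$-distance is at most 1 (the meeting criterion used in the lower-bound proof of this section), a move that would leave the grid is rejected so the agent stays put (the boundary rule is an assumption of this sketch), and the function name and the `max_steps` cap are hypothetical.

```python
import random

def diffusion_time(n, m, rng, max_steps=200_000):
    """Steps until all m agents are infected, or None if max_steps is exceeded."""
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    pos = [tuple(rng.randint(-n, n) for _ in range(3)) for _ in range(m)]
    infected = [False] * m
    infected[0] = True  # agent a_1 holds the virus at t = 0

    def spread():
        # Only agents infected before this call act as sources (no chaining).
        sources = [pos[i] for i in range(m) if infected[i]]
        for j in range(m):
            if not infected[j] and any(
                sum(abs(a - b) for a, b in zip(pos[j], s)) <= 1 for s in sources
            ):
                infected[j] = True

    spread()
    t = 0
    while not all(infected) and t < max_steps:
        for i in range(m):
            step = rng.choice(moves)
            new = tuple(c + d for c, d in zip(pos[i], step))
            # Assumed boundary rule: reject moves that would leave {-n..n}^3.
            if all(-n <= c <= n for c in new):
                pos[i] = new
        t += 1
        spread()
    return t if all(infected) else None
```

Averaging `diffusion_time` over many seeds for a few values of `m` gives a rough empirical view of how $T$ shrinks as the number of agents grows; the toy sizes reachable this way are far from the asymptotic regime of the proposition.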

Note that our analysis in Section 3 and Section 4 cannot be applied directly to prove Proposition 5.1,
because we required the side of each subcube to be of length $\ell_2 = \sqrt{n^3/m}$, which is larger than $(2n + 1)$ when
$m = o(n)$. The diffusion time for this case turns out to depend on $m$ and $n$ in a way different from the case
where $n \log^2 n < m < n^3$. Nevertheless, some of the arguments can still be borrowed from Lemma 4.2
together with the use of the mixing time of a random walk in $V^3$. Because of the similarity of our analysis with
previous sections, we only sketch our proof and highlight the new main technicalities.
We first show the lower bound of the diffusion time: 

Lemma 5.2. Let $a_1, a_2, \ldots, a_m$ be placed uniformly at random on $V^3$, where $m < 2n + 1$. Let $a_1$ be the
agent that holds a virus at $t = 0$. Let $T$ be the diffusion time. We have, for any constant $c > 0$,

$$\Pr\left[T < \frac{n^3}{m} \log^{-c} n\right] \le \log^{-c} n.$$

Proof. Let these $m$ random walks be $S^1, S^2, \ldots, S^m$. Since each random walk is already at its stationary
distribution at $t = 0$, they are all distributed uniformly at any specific time. Therefore, for any fixed $t$ and
fixed $j > 1$, $\Pr[\|S^1_t - S^j_t\|_1 \le 1] \le 7/(2n + 1)^3$. By a union bound,

$$\Pr\left[\exists\, t \le \frac{n^3}{m} \log^{-c} n,\ i > 1 : \|S^1_t - S^i_t\|_1 \le 1\right] \le \frac{n^3}{m \log^c n} \cdot m \cdot 7(2n + 1)^{-3} \le \log^{-c} n.$$

Therefore, with probability at least $1 - \log^{-c} n$, $S^1$ will not meet any other agent before $t = \frac{n^3}{m}\log^{-c} n$,
which also implies that the diffusion process has not been completed. □
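The arithmetic of this union bound is easy to check numerically. The sketch below evaluates the bound for concrete values; the function name is hypothetical and natural logarithms are assumed, since the base of the logarithm is not fixed in this part of the text. The constant 7 is the number of lattice points within $L_1$-distance 1 of a given point in three dimensions (the point itself and its six neighbors).

```python
import math

def early_meeting_bound(n, m, c):
    """Union bound on the chance that S^1 meets another agent before t_max."""
    t_max = n**3 / (m * math.log(n) ** c)      # time horizon (n^3 / m) log^{-c} n
    per_pair_per_step = 7 / (2 * n + 1) ** 3   # chance a fixed pair is within distance 1
    return t_max * m * per_pair_per_step       # sum over time steps and agents

bound = early_meeting_bound(100, 50, 2)
```

The factor $m$ cancels, and since $7n^3/(2n+1)^3 < 7/8$, the result is always below $\log^{-c} n$.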

Next we move to the upper bound: 

Lemma 5.3. Let $a_1, a_2, \ldots, a_m$ be placed uniformly at random on $V^3$, where $m < n/\log^2 n$. Let $a_1$ be the agent
that holds a virus at $t = 0$. Let $T$ be the diffusion time. We have

$$\Pr\left[T > \frac{2n^3}{m} \log^{15} n\right] \le \exp(-(\log^2 n)/2).$$

The following is a key lemma for the upper bound analysis. The lemma reuses arguments that appeared
in Lemma 4.2. However, as the agents are sparser in this case, new diffusion rules for the coupling process
and the corresponding probabilistic bounds are needed.

Lemma 5.4. Consider the diffusion process in which $m < n/\log^2 n$. Fix a time $t$, and let $A^f$ and $A^u$ be the
set of infected and uninfected agents at time $t$ with $|A^f| = m_1$ and $|A^u| = m_2$. Let $c$ be a sufficiently large
constant and $\Delta t = c n^3 (\log n)/m$. Let $M(t)$ be the number of newly infected agents from time $t$ to $t + \Delta t$.
Assume the agents are arbitrarily (in an adversarial manner) distributed at time $t$. We have


Pr 



. . minimi , mo} 

M(t) > } I' 2J 

log n 



1 , -5 

> - log n. 



Proof. Similar to the proof of Lemma 4.2, we first count the number of times the infected agents meet the
uninfected agents. We then show that this number is close to $M(t)$ by demonstrating that the number of
overcounts is moderate, which yields the desired result. The device we use to count the number of meetings,
however, is different from the one we used for Lemma 4.2. In Lemma 4.2 we couple each of the walks in
$V^3$ with their unbounded counterparts; since we only focus on a short time frame, the bounded walks largely
coincide with the unbounded ones. Here, the right time frame to analyze is longer and the walks in $V^3$ are
more likely to hit the boundary. It becomes less helpful to relate these walks with the unbounded ones. Our
analysis, instead, utilizes the mixing time property of $V^3$.
Specifically, we cut $\Delta t$ into disjoint time intervals, each of which is of size $c n^2 \log n$ steps for some
constant $c$ to be determined later. We refer to the $k$-th time interval as the $k$-th round. The total number of
rounds in $\Delta t$ steps is thus $n/m$.

We couple the diffusion process with a slower one. First, only agents in $A^f$ are allowed to transmit the
virus. An agent in $A^u$ will not be able to infect others even if it becomes infected. This rule holds throughout
the $\Delta t$ time increment.

In each round, we also impose more specific constraints on the diffusion rule as follows. At the beginning
of the $k$-th round (for any $k$), we first wait for $c_0 n^2 \log n$ steps so that the distribution of each agent is
$1/(16n^3)$-close to the uniform distribution (see Definition A.14 and Lemma A.15 for details; $c_0$ is an appropriate
constant that exists as a result of Lemma A.15). Within these time steps, no agent becomes infected even
if it meets a previously infected agent. After these steps, for an arbitrary $a_i \in A^f$ and $a_j \in A^u$, let $X^k_{i,j} = 1$
if both of the following conditions hold:

• the $L_1$-distance between $a_i$ and $a_j$ is between $n/500$ and $n/450$.

• the $L_1$-distance between $a_j$ and any boundary is at least $n/20$.

Since $c_0 n^2 \log n$ is already the mixing time for random walks on $V^d$, it is straightforward to see that
with $\Omega(1)$ probability $X^k_{i,j} = 1$, for any $k$.

After $c_0 n^2 \log n$ steps at the $k$th round, our slower diffusion rule allows $a_i \in A^f$ to transmit its virus to
$a_j \in A^u$ at the $k$th round only if

• $X^k_{i,j} = 1$.

• $a_i$ meets $a_j$ after the waiting stage and before the round ends.

• $a_i$ and $a_j$ have not visited any boundary after the waiting stage before they meet. In other words, an
agent $a_i \in A^f$ ($a_j \in A^u$ resp.) loses its ability to transmit (receive resp.) the virus when it hits the
boundary.

Let $Y^k_{i,j}$ be an indicator random variable that is set to 1 if and only if $a_i \in A^f$ transmits its virus to $a_j \in A^u$
under the slower diffusion rule at the $k$th round, pretending that $a_j$ is uninfected at the beginning of the $k$-th
round even if it got infected in the previous rounds. Hence $Y^k_{i,j}$, for a specific $i$ and $j$, can be 1 for more
than one $k$. This apparently unnatural definition is used for ease of counting in the sequel.
By Lemma 2.3,



Pr[*$ = 1] > Pr[*£ = 1 | X« 4 = 1] Pr[X& = 1] = fi(l/n). 
Therefore, we have 

E 



0H>g, (17) 
\ m J m 



for some constant t\ . 

We briefly lay out our subsequent analysis. We want to show two properties: 



1- I''->:„a.^ = n(min{m 1 ,m 2 })] = 17(1). 
2. For all j, k Y k - = O(l) with high probability. 

We claim that these two properties together concludes our result. Roughly speaking, when (j^ijk^j = ^(min{mi, ^2}) 
and (vj : k Y k - = 0(1)^ occur, each aj G A u meets at most O(l) agents in A$ while the total number 
of meetings between infected and uninfected agents is $\Omega(\min\{m_1, m_2\})$. Consequently, the total number of
uninfected agents that ever meet an infected agent is $\Omega(\min\{m_1, m_2\})$, hence our conclusion.

To prove the first property, we need to show with high probability, for any j, we have ^ k Y^- = O(l). 
Similarly, we also need to show with high probability, for any i, Y^jk^tj = 0(1). Combining both of 
these we have Y^ijk^tj = 0(min{mi, ma}) with high probability. Together with Equation [P7] some 
rearrangement of terms and Chernoff bounds, we can conclude that Pr[]^ • k Y^- = f2(min{mi, ^2})] = 

n(i). 

We now carry out this scheme. We proceed to show that 



i.k 



0(1)] > l-exp(-0(log 2 n)). 



(18) 



and note that showing ^ ■ k Y^- = O(l) can be done similarly. We prove Equation [18] via the following two 
steps: 



1. first, we show that with high probability, ^ = O(l) for any fixed k and j. 



2. second, we show that with high probability, the number of k's such that J2i Y^- > is O(l) for all j. 



Intuitively, the first step ensures that there will not be too many meetings associated with a, for any single 
round. The second step specifies an upper bound on the number of rounds in which a 3 - meets at least one 
infected agent. When both event occurs, the total number of meetings for aj is O(l). 

Let us start with the first step. Fix a specific k and aj £ A u , by Corollary 12.51 we have 



Pr 



X 



h.1 



< 



< 



• ' X i,j 



log 2 n 



c\ log n 



2 \ lo g 2 n 



mi 
log 2 n 



c\ log n 



n 



n 

log 2 n 



< exp ( — log 2 n log log n) . 



By a union bound, we can also conclude that 



Pr 



< exp(— - log 2 n log log n) 



(19) 



Next, let us move to the second step. Let us define a family of indicator random variables I(J, k), 
which sets to 1 if and only if YlmeAf ^ij — When j and k are fixed, we can compute the probability 

Pv[I(j,k) = 1]: 



Pr[/(j,A;) = l]=E[J(i,A;)]<E 



E Y "j 

a.i£Af 



< 



T\m\ 



n 



The probability holds regardless of the history of the process up to the time the A;th round starts because 
con 2 log n time steps are used at kth round to shuffle the agents so that they are distributed sufficiently 
uniform after these steps. We may apply a special case of Chernoff bound (see, e.g., Theorem IA.2b to show 
that Pr[X] fc k) > log 2 n] < exp(- log 3 n). 
Therefore, we have



Pr 



3a,j G A u : ^ k) > log 2 



n 



k<- 



< exp(-G(log 2 n)). 



(20) 



For a specific a.j G A u , when both (J2k ^) < 1°§ 2 n ) an( ^ ( ^ • J2i Y tj — 1°S 2 n J > we know that 
£ ie A/,fc y ij ^ lQ g 4n - Hence Equation [H and |2Q] imply PrE ieA/ fc yA > log 4 n] < exp(-9(log 2 n)) 



and therefore 



Pr 



3*1 € A": Yl Y *i> 1 ^ 



n 



< exp(— 0(log 2 n)) 



(21) 



Similarly, we can show 



Pr 



3a, 6^: £ !&>log 4 n 



< exp(— 6 (log 2 n)) 



(22) 



Equation |2T] and [22] yield 



Pr 



Y^j < min{mi, 7712} log 4 n 



> 1 -exp(-9(log 2 n)). 



(23) 



This gives the first property in the discussion following Equation [TTJ Moreover, Equation [21] gives the 
second property. 

Now, by using similar argument in the proof of Lemma 14. 21 Equation [T7] and [23] together give 



Pr 



i,j,k 



T\ min{mi, 7722} 



> Pr 



> 



i,j,k 



T\ ra\m2 
2m 



> log 5 n 



(24) 



When (Eij,*iy > rimin{ 4 mi ' m2} ) and (va, G A" : Ea ie A/,fc^j < log 4 ™), the total number of in- 
fected agents is at least — ^^I^f" 2 ^ - Hence, by setting c = 2co, and using Equation [21] and [24] our 
lemma follows. □ 



From this we can mimic the argument that appeared in Proposition 4.9 to reach the conclusion below:

Corollary 5.5. Consider the diffusion process in which m < lo ^ n - Fix a specific time t, and let A^ and 
A u be the set of infected and uninfected agents at t such that \A*\ = m\ and \A U \ = 777,2- Let M(t) be the 
number of new infected agents between time t and time t + j^ log 14 77. Assume the agents are arbitrarily (in 
an adversarial manner) distributed at time t, we have 



Pr 



M(t) > min |tt7i, > 1 — exp(— log 2 n). 



Similar to Proposition 4.9, Corollary 5.5 estimates the growth rate of infection as either doubling the
number of infected agents or halving the uninfected ones within a certain time interval. One can then show
that this implies Lemma 5.3. The argument is analogous to Section 4 and hence is skipped here.
A Probability Review

This section reviews some probabilistic building blocks that are needed in our analysis. 
A.1 Concentration bounds

Theorem A.1 (Chernoff bounds). Let $X_1, \dots, X_n$ be independent Poisson trials with $\Pr[X_i = 1] = p_i$. Let
$X = \sum_{i \le n} X_i$ and $\mu = E[X]$. Then the following Chernoff bounds hold:

• For $0 < \delta < 1$,

$$\Pr[|X - \mu| \ge \delta\mu] \le 2\exp(-\mu\delta^2/3).$$

• For $R \ge 6\mu$,

$$\Pr[X \ge R] \le 2^{-R}.$$
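As a quick sanity check of the first bound in Theorem A.1, the following minimal sketch compares an empirical estimate of $\Pr[|X - \mu| \ge \delta\mu]$ with $2\exp(-\mu\delta^2/3)$. The function names, the choice of 200 trials with $p_i = 0.3$, and $\delta = 0.5$ are illustrative assumptions and do not come from the analysis itself.

```python
import math
import random


def chernoff_two_sided(mu, delta):
    """Upper bound 2*exp(-mu*delta^2/3) from Theorem A.1 (stated for 0 < delta < 1)."""
    return 2.0 * math.exp(-mu * delta * delta / 3.0)


def empirical_deviation(probs, delta, trials=2000, seed=0):
    """Monte Carlo estimate of Pr[|X - mu| >= delta*mu] for independent Poisson trials."""
    rng = random.Random(seed)
    mu = sum(probs)
    hits = 0
    for _ in range(trials):
        x = sum(1 for p in probs if rng.random() < p)
        if abs(x - mu) >= delta * mu:
            hits += 1
    return hits / trials


# Hypothetical parameters: 200 trials with success probability 0.3 each, so mu = 60.
probs = [0.3] * 200
delta = 0.5
print(empirical_deviation(probs, delta), "<=", chernoff_two_sided(sum(probs), delta))
```

The bound is far from tight for these parameters, which is typical: the analysis only needs the exponential decay in $\mu\delta^2$.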

Theorem A.2 (Chernoff bounds for dependent variables). Let $X_1, \dots, X_n$ be possibly dependent Poisson
trials with $\Pr[X_i = 1 \mid X_1, \dots, X_{i-1}] \ge p$. Let $X = \sum_{i \le n} X_i$ and $\mu = np$. Then the following Chernoff
bound holds:

• For $0 < \delta < 1$,

$$\Pr[X < (1 - \delta)\mu] \le \exp(-\mu\delta^2/2).$$

On the other hand, if $\Pr[X_i = 1 \mid X_1, \dots, X_{i-1}] \le p$, the following bound holds:

• For any $\delta > 0$,

$$\Pr[X > (1 + \delta)\mu] \le \exp(-\mu\delta^2/4).$$

Theorem A.3 (Hoeffding's inequality). Let $X_1, X_2, \dots, X_n$ be independent random variables such that
$a_i \le X_i \le b_i$. Let $S = \sum_{i \le n} X_i$. Then

$$\Pr(|S - E[S]| \ge t) \le 2\exp\left(-\frac{2t^2}{\sum_{i \le n}(b_i - a_i)^2}\right).$$

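Theorem A.4 (Azuma-Hoeffding inequality). Let $X_0, X_1, \dots, X_n$ be a martingale such that $|X_i - X_{i-1}| \le c_i$.
Then, for all $t \ge 0$ and any $\lambda > 0$,

$$\Pr[|X_t - X_0| \ge \lambda] \le 2\exp\Big(-\lambda^2 \Big/ \Big(2\sum_{i=1}^{t} c_i^2\Big)\Big).$$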
A.2 Random walks in bounded and unbounded spaces

In this subsection we state some peripheral results for bounded and unbounded random walks that will 
be useful in handling the meeting time and position of multiple random walks in the next section. 

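Theorem A.5. (First Passage Time, Chapter 3 of Q) Let $\{S_t : t \in \mathbb{N}\}$ be a one-dimensional random walk
from the origin. The probability $\varphi_{r,t}$ that the first passage through $r$ occurs at time $t$ is given by

$$\varphi_{r,t} = \frac{r}{t}\binom{t}{(t+r)/2} 2^{-t},$$

where the binomial coefficient is taken to be zero when $t + r$ is odd.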
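To make the first-passage statement concrete, the sketch below assumes the classical closed form $\varphi_{r,t} = \frac{r}{t}\binom{t}{(t+r)/2}2^{-t}$ for the simple symmetric walk (zero when $t + r$ is odd) and checks it against an exact dynamic program that absorbs the walk at $r$. Both function names are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from math import comb


def first_passage_formula(r, t):
    """Assumed closed form (r/t) * C(t, (t+r)/2) * 2^(-t); zero on parity mismatch."""
    if t < r or (t + r) % 2 != 0:
        return Fraction(0)
    return Fraction(r, t) * comb(t, (t + r) // 2) / Fraction(2) ** t


def first_passage_dp(r, t_max):
    """Exact Pr[first passage through r > 0 happens at time t], for t = 0..t_max."""
    alive = {0: Fraction(1)}      # position -> probability mass not yet absorbed at r
    result = [Fraction(0)]        # result[t] is the first-passage probability at time t
    for _ in range(t_max):
        nxt, absorbed = {}, Fraction(0)
        for pos, pr in alive.items():
            for step in (-1, 1):
                q = pos + step
                if q == r:
                    absorbed += pr / 2
                else:
                    nxt[q] = nxt.get(q, Fraction(0)) + pr / 2
        alive = nxt
        result.append(absorbed)
    return result


dp = first_passage_dp(3, 15)
print(all(dp[t] == first_passage_formula(3, t) for t in range(1, 16)))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point tolerance check.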



Lemma A.6. Let $S$ be a bounded random walk in $[-n, n]$ starting from position $P$. For any other position
$Q \in [-n, n]$ with $|P - Q| > \log^2 n$, the probability that $S$ visits $Q$ within $|P - Q|^2/\log^4 n$ time steps is at
most $\exp(-\log^3 n)$ when $n$ is sufficiently large.

Proof. Let us couple S with an unbounded random walk S' that also starts at P in the natural way, i.e. S 
and S' share the same random tosses to drive their moves. 

First, we claim that at the first time S visits Q, the number of distinct lattice points S' visits is at least 
\P — Q\. This claim can be seen through analyzing the following two cases. 

Case 1. The walk S never visits a boundary before its first visit to Q. In this case, S' coincides with S, 
which implies S' also visits all the lattice points between P and Q. The claim therefore follows. 
Case 2. The walk $S$ visits a boundary before it first visits $Q$. In this case, the boundary that $S$ visits and
the point $Q$ lie on different sides of $P$. In other words, the distance between this boundary and $Q$ is at least
$|P - Q|$. Now let us only consider the time interval between the last time $S$ visits the boundary (namely, $t_0$)
and the first time $S$ visits $Q$. The trajectory of $S'$ within this time interval is identical to the trajectory of $S$
(up to an offset produced between time $0$ and $t_0$). Therefore, from $t_0$ to the first time $S$ visits $Q$, the coupled
walk $S'$ visits at least $|P - Q|$ distinct lattice points.

An immediate consequence of our claim is that a necessary condition for $S$ to visit $Q$ is that $S'$ has to
visit either $P - \frac{|P - Q|}{2}$ or $P + \frac{|P - Q|}{2}$. By Theorem A.5, the probability that $S'$ ever visits either of these points
within time $|P - Q|^2/\log^4 n$ is at most $\exp(-\log^3 n)$ when $n$ is sufficiently large, which completes our
proof. □

The next lemma concerns the first passage time for a random walk over bounded space. 

Lemma A.7. Let $S$ be a random walk on $V^1 = \{-n, \dots, n\}$ that starts at $A$. Let $B$ be a point on $V^1$ such
that $|B - A| = r$. Let $T$ be the first time $S$ visits $B$. Fix an arbitrary constant $c$; we have

$$\Pr[T \le cr^2] = \Omega(1).$$

Proof. Without loss of generality, let us assume $-n \le A < B \le n$. We couple $S$ with an unbounded
random walk $S'$ that also starts at $A$ in the natural way, i.e., having $S$ and $S'$ share the same random tosses
to drive their moves. Let $T'$ be the first time $S'$ visits $B$. We first show that $T' \ge T$. Note that before $T'$, $S'$
is always to the left of $B$, and hence of $n$. It is then easy to see that $S$ is always overlapping or to the right of
$S'$ before $T'$. Hence $S'$ hitting $B$ at $T'$ implies that $S$ has already hit it at a time before or at $T'$.

Finally, by Theorem A.5, there exists a constant $C$ such that for $r, t \ge C$ with $t = \Theta(r^2)$, the probability $\varphi_{r,t}$ is within
constant factors of $\frac{r}{t^{3/2}} e^{-r^2/(2t)}$. Therefore,

$$\Pr[T \le cr^2] \ge \Pr[T' \le cr^2] = \Omega(1).$$

□
Corollary A.8. Let $S$ be a random walk on $V^1$ that starts at $A$, and $B$ be a point on $V^1$ such that $|B - A| = r$.
Let $c_1$ and $c_2$ be two arbitrary constants and let $t = c_1 r^2$. We have

$$\Pr[|S_t - B| \le c_2 r] = \Omega(1).$$

Proof. Let $T$ be the first time $S$ visits $B$. By Bayes' rule we have

$$\Pr[|S_t - B| \le c_2 r] \ge \Pr[|S_t - B| \le c_2 r \mid T \le t]\Pr[T \le t] = \Pr[|S_t - S_T| \le c_2 r \mid T \le t]\Pr[T \le t].$$

By Lemma A.7, $\Pr[T \le t] = \Omega(1)$. Next we claim $\Pr[|S_t - S_T| \le c_2 r \mid T \le t] = \Omega(1)$. This
can be seen by showing $\Pr[|S_t - S_T| \le c_2 r \mid T] = \Omega(1)$ uniformly over $T \in [1, t]$. For this, note that
$\Pr[|S_t - S_T| \le c_2 r \mid T] = \Pr[|S'_\tau| \le c_2 r]$, where $\tau = t - T$ and $S'$ is a random walk starting at 0. We
then write $\Pr[|S'_\tau| \le c_2 r] = \Pr[|S'_\tau/\sqrt{\tau}| \le c_2 r/\sqrt{\tau}] \ge \Pr[|S'_\tau/\sqrt{\tau}| \le c_2/\sqrt{c_1}] = \Omega(1)$ by Gaussian
approximation on $S'_\tau/\sqrt{\tau}$. Therefore, $\Pr[|S_t - B| \le c_2 r] = \Omega(1)$. □

For a $d$-dimensional unbounded random walk starting from the origin, let $p_d(t, x)$ be the probability
that the walk visits position $x$ at time $t$. Let $q_d(t, x)$ be the probability that the random walk visits $x$ within
time $t$. When $d = 3$, we will silently drop the subscripts and write the functions as $p(\cdot, \cdot)$ and $q(\cdot, \cdot)$.

Theorem A.9. [9] The function $p_d(t, x)$ has the following analytic form when $t - \|x\|_1$ is even:

$$p_d(t, x) = \frac{2}{t^{d/2}}\left(\frac{d}{2\pi}\right)^{d/2}\exp\left\{\frac{-d\|x\|_2^2}{2t}\right\} + e_t(x),$$

where $|e_t(x)| \le \min\{O(t^{-(d+2)/2}),\, O(\|x\|_2^{-2} t^{-d/2})\}$. Moreover, $p_d(t, x) = 0$ when $t - \|x\|_1$ is odd.
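The leading term of Theorem A.9 can be checked numerically in the simplest setting. The sketch below assumes $d = 1$, where the exact distribution of the simple random walk is binomial, and compares it with $2\,(d/(2\pi t))^{d/2}\exp(-d x^2/(2t))$; the function names are hypothetical and the one-dimensional restriction is an illustrative choice, not the case used in the analysis.

```python
import math


def exact_p1(t, x):
    """Exact Pr[S_t = x] for the simple random walk on Z started at 0."""
    if abs(x) > t or (t - abs(x)) % 2 != 0:
        return 0.0
    return math.comb(t, (t + x) // 2) / 2 ** t


def leading_term(t, x, d=1):
    """Leading term 2 * (d / (2*pi*t))^(d/2) * exp(-d*|x|^2 / (2t)); x is a scalar here."""
    return 2.0 * (d / (2.0 * math.pi * t)) ** (d / 2.0) * math.exp(-d * x * x / (2.0 * t))


t = 1000
for x in (0, 20, 40):
    print(x, exact_p1(t, x), leading_term(t, x))
```

The factor 2 in the leading term reflects the parity constraint: the walk can only occupy points with $t - \|x\|_1$ even, so the mass is concentrated on half of the lattice.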
Theorem A. 10. fiTy The function qd(t, x) satisfies the following asymptotic relations: 

• If d = 2, f / 0, and t > then we have 

q 2 {t,x) = 

• Ifd > 3, x 7^ 0, and t > then we have 

qd(t,x) = O 



log [ a? | 



1 



id— 2 



When d > 3, it is not difficult to see that the above asymptotic result is tight by using Markov inequality: 
Corollary A. 11. When d > 3, the function qd(t,x) satisfies the following asymptotic relation for t > 

1 \ 



q d (t, x) = Q 



id— 2 
\2 



Next we show for any random walk that could start near the boundary, waiting for a short period allows 
the walk to both stay away from the boundary and be sufficiently close to where it starts. 

Lemma A.12. Consider a random walk $S$ over the $d$-dimensional space $V^d$ that starts at $x$, where $x =
(x_1, \dots, x_d)$ is an arbitrary point in the space. Let $c = (c_1, \dots, c_d)$ be a point in $V^d$ such that $\|c - x\| = \Theta(r)$.
Also let $t = r^2$. We have

$$\Pr[S_t \in B(c, r)] = \Omega(1). \qquad (25)$$
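Proof. Recall that at each step, the random walk $S$ uniformly selects a neighboring point to move to. We
may also interpret a move of $S$ as if it first randomly selects an axis to move along and next decides which
one of the two directions to take when the axis is fixed. Let $T_i$ be the number of the walk's moves that are
along the $i$-th axis within $t$ steps. Define the event $e$ as:

$$\forall i: \quad \frac{1}{2}\cdot\frac{t}{d} \le T_i \le \frac{5}{4}\cdot\frac{t}{d}.$$

By Chernoff bounds, we have for any specific $i \in [d]$,

$$\Pr\left[\frac{1}{2}\cdot\frac{t}{d} \le T_i \le \frac{5}{4}\cdot\frac{t}{d}\right] \ge 1 - \exp(-\Theta(t)) \ge 1 - \frac{1}{4d}$$

for sufficiently large $t$. Therefore,

$$\Pr[e] \ge 1 - d\cdot\frac{1}{4d} \ge \frac{3}{4}.$$

Let $(S_t)_i$ be the $i$-th coordinate of the point $S_t$. We next compute $\Pr[S_t \in B(c, r) \mid e]$:

$$\begin{aligned}
\Pr[S_t \in B(c, r) \mid e]
&= E\big[\Pr[S_t \in B(c, r) \mid T_1, \dots, T_d, e] \,\big|\, e\big] \\
&= E\Big[\Pr\Big[\bigwedge_{i \in [d]} (S_t)_i \in [c_i - r, c_i + r] \,\Big|\, T_1, \dots, T_d, e\Big] \,\Big|\, e\Big] \quad \text{(by the definition of } B(c, r)\text{)} \\
&= E\Big[\prod_{i \in [d]} \Pr\big[(S_t)_i \in [c_i - r, c_i + r] \,\big|\, T_i, e\big] \,\Big|\, e\Big].
\end{aligned}$$

The last equality holds because the moves along the $i$-th axis are independent of the moves along other axes
when $T_i$ is known. Next, using Corollary A.8, we have

$$\Pr\big[(S_t)_i \in [c_i - r, c_i + r] \,\big|\, T_i, e\big] = \Omega(1).$$

Therefore,

$$\Pr[S_t \in B(c, r) \mid e] = E\Big[\prod_{i \in [d]} \Pr\big[(S_t)_i \in [c_i - r, c_i + r] \,\big|\, T_i, e\big] \,\Big|\, e\Big] = E\Big[\prod_{i \in [d]} \Omega(1) \,\Big|\, e\Big] = \Omega(1).$$

Finally, we have

$$\Pr[S_t \in B(c, r)] \ge \Pr[S_t \in B(c, r) \mid e]\cdot\Pr[e] = \Omega(1).$$

□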



Corollary A. 13. Let r be sufficiently large and r < 2(2/3+6) ' wnere ft ™ an arbitrary constant between 1 
and 80d. Let A = x and B be two points in V d such that ||^4 — < r. Consider two bounded random 
walks S 1 and S 2 in V d that start with A and B respectively. Then, with 0(1) probability, at time t = r 2 , 
• $S_t^1$ is at least $\beta \cdot r$ away from any of the boundaries,

• $\|S_t^1 - A\|_\infty \le (\beta + 2)r$, and

• $\|S_t^1 - S_t^2\|_\infty \in (r, 3r)$.

Proof. Let us first find an arbitrary $c = (c_1, \dots, c_d)$ such that

• For all $i \in [d]$: $|c_i - x_i| = (\beta + 1)r$, i.e., $\|c - x\| = O(r)$.

• For all $i \in [d]$: $-n + (\beta + 1)r \le c_i \le n - (\beta + 1)r$, i.e., $c$ is sufficiently far from the boundary.

We set up $\beta$ in a way that such $c$ always exists. By Lemma A.12, we have $\Pr[S_t^1 \in B(c, r)] = \Omega(1)$. Next,
in case $S_t^1 \in B(c, r)$, let $d(S_t^1)$ be an arbitrary point such that

• $|d_i(S_t^1) - (S_t^1)_i| = 2r$,

• the distance between $d(S_t^1)$ and any boundary is at least $\beta r$,

• $(\beta + 1)r \le \|d(S_t^1) - B\|_\infty \le (\beta + 5)r$.

Again by the way we designed $\beta$, such $d(S_t^1)$ always exists so long as $S_t^1 \in B(c, r)$. Using Lemma A.12
again, we have

$$\Pr[S_t^2 \in B(d(S_t^1), r) \mid S_t^1 \in B(c, r)] = \Omega(1).$$

Therefore, we have

$$\Pr\big[(S_t^2 \in B(d(S_t^1), r)) \wedge (S_t^1 \in B(c, r))\big] = \Omega(1).$$

Finally, observe that when $(S_t^2 \in B(d(S_t^1), r)) \wedge (S_t^1 \in B(c, r))$, the three conditions specified in the
corollary are all met. This completes our proof. □

A.3 Mixing time in graphs 

Definition A.14 (Statistical distance). Let $X$ and $Y$ be two probability distributions over the same support
$P$. The statistical distance between $X$ and $Y$ is

$$\Delta(X, Y) = \max_{T \subseteq P} |\Pr[X \in T] - \Pr[Y \in T]|.$$

We also say that the distribution $X$ is $\epsilon$-close to $Y$ if $\Delta(X, Y) \le \epsilon$.
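For finite supports, the maximum in Definition A.14 is attained by the event on which $X$ is more likely than $Y$, so the statistical distance equals half of the $L_1$ distance between the two probability vectors. The sketch below illustrates this equivalence on a toy example; the dictionary representation and both function names are assumptions made for the illustration.

```python
from itertools import combinations


def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts outcome -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in support)


def statistical_distance_bruteforce(p, q):
    """max over all events T of |Pr[X in T] - Pr[Y in T]|; only sensible for tiny supports."""
    support = sorted(set(p) | set(q))
    best = 0.0
    for k in range(len(support) + 1):
        for event in combinations(support, k):
            gap = abs(sum(p.get(w, 0.0) for w in event) - sum(q.get(w, 0.0) for w in event))
            best = max(best, gap)
    return best


p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.2, "b": 0.3, "c": 0.5}
print(statistical_distance(p, q), statistical_distance_bruteforce(p, q))
```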

Lemma A.15 (Mixing time for $V^3$). Consider a random walk that starts at point $A$ for an arbitrary $A \in V^3$.
Let $\pi_t(A)$ be the distribution of the walk at time $t$, and $\pi$ be the uniform distribution on the nodes in $V^3$. Let
$\epsilon > 0$. When $t = \Theta(n^2 \log(1/\epsilon))$, we have

$$\Delta(\pi_t(A), \pi) \le \epsilon.$$

Although the mixing time of the high-dimensional torus has been analyzed, we are not aware of any literature
that pins down the exact mixing time for $V^3$. It is, however, straightforward to derive the mixing time
in asymptotic form by computing the conductance of $V^1$ (the one-dimensional grid) and using results on
mixing times of tensored graphs (e.g., Chapter 5 in [17] and [15]).
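The quadratic dependence on $n$ can be observed directly on the one-dimensional grid. The sketch below iterates the exact distribution of a walk on $\{-n, \dots, n\}$ and reports the first time its statistical distance to uniform drops below $\epsilon$. The boundary rule used here (the walk stays put when a move would leave the grid) is an assumption made so that the uniform distribution is stationary; it is not necessarily the rule used in the main text, and the function names are hypothetical.

```python
def step(dist):
    """One step on a path: move left or right w.p. 1/2 each; stay put if the move would exit."""
    m = len(dist)
    new = [0.0] * m
    for i, pr in enumerate(dist):
        for j in (i - 1, i + 1):
            if 0 <= j < m:
                new[j] += pr / 2
            else:
                new[i] += pr / 2
    return new


def tv_to_uniform(dist):
    """Statistical distance between dist and the uniform distribution on the same support."""
    u = 1.0 / len(dist)
    return 0.5 * sum(abs(pr - u) for pr in dist)


def mixing_time(n, eps, start=0):
    """Smallest t at which the walk on {-n..n}, started at index `start`, is eps-close to uniform."""
    dist = [0.0] * (2 * n + 1)
    dist[start] = 1.0
    t = 0
    while tv_to_uniform(dist) > eps:
        dist = step(dist)
        t += 1
    return t


for n in (5, 10, 20):
    print(n, mixing_time(n, 0.01))
```

Doubling $n$ should roughly quadruple the reported time, in line with the $\Theta(n^2 \log(1/\epsilon))$ form of Lemma A.15.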
B Multiple random walks in bounded and unbounded spaces



Proof of Lemma 2.3. Let us consider the following two processes $P_1$ and $P_2$ in the same probability space
(we slightly abuse the terminology "process" to mean the expression of the random tosses that drive all
random walks of interest).

1. The process $P_1$: consider the random walk $S(A)$. We are interested in the event that $S(A)$ visits $B$
within time $2t$, which occurs with probability $q(2t, x)$. Notice that $S(A)$ is unable to visit $B$ at odd
steps.

2. The process $P_2$: consider the random walks $S^1(A)$ and $S^2(B)$. We are interested in the event that the
two walks collide by time $t$, which occurs with probability $Q(t, x)$.

We couple the two random processes as follows. We first construct the single random walk in $P_1$ from
the two walks in $P_2$. Note that one time step in $P_2$ involves simultaneous moves of the walks $S^1(A)$ and
$S^2(B)$. Corresponding to this step, the single walk in $P_1$ will be set to move first in the same direction as
$S^1(A)$, and then in the reverse direction from $S^2(B)$. This way the moves at time $t > 0$ in $P_2$ are translated
into the moves at time $2t - 1$ and $2t$ in $P_1$. The construction can naturally be reversed to map a walk in $P_1$
to two walks in $P_2$. This coupling ensures that the $L_1$ distance between $S^1$ and $S^2$ at time $t$ in $P_2$ is the same
as the distance between $S$ and $B$ at time $2t$ in $P_1$. Note that collision in $P_2$ can only occur at even steps,
and hence the hitting event in $P_1$ is well-defined. Therefore $S^1$ and $S^2$ collide at or before $t$ if and only if $S$
visits $B$ at or before $2t$.
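The coupling above rests on a simple invariant: if the single walk replays each move of $S^1$ followed by the reverse of the corresponding move of $S^2$, then $S^1_t - S^2_t = S_{2t} - B$ at every time $t$. The sketch below checks this invariant on random trajectories in $\mathbb{Z}^3$; the helper names and the specific start points are assumptions chosen for illustration.

```python
import random

# The six unit moves of a walk on Z^3.
DIRS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def add(p, v):
    return tuple(a + b for a, b in zip(p, v))


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def check_coupling(A, B, steps, seed=0):
    """Verify S1_t - S2_t == S_{2t} - B for all t, where S replays +move(S1) then -move(S2)."""
    rng = random.Random(seed)
    s1, s2, s = A, B, A
    for _ in range(steps):
        m1, m2 = rng.choice(DIRS), rng.choice(DIRS)
        s1, s2 = add(s1, m1), add(s2, m2)       # one simultaneous step in P2
        s = sub(add(s, m1), m2)                 # two consecutive steps in P1
        if sub(s1, s2) != sub(s, B):
            return False
    return True


print(check_coupling((0, 0, 0), (2, 1, -3), steps=500))
```

In particular, $S^1_t = S^2_t$ holds exactly when $S_{2t} = B$, which is the equivalence used in the proof.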

Using the bound given in Lemma lA.l II we have for t > 11 """ - 



Q(t,x) = S 



2' 

1 



□ 



Proof of Lemma |2~4] Let "' tj c . be the event that S l and S^ +1 collide at C{ at time step ij (not necessarily 
for the first time) for all i E [j]. Our goal is to bound the following quantity 



Pr 



ti<t 2 <-<tj d,...,Cj 

We cut the time interval into j frames [0, t\], [ti, £2]^ ■■■> [tj-i,tj], so that the random walks with 
different frames are independent. Define D^i be the position of S' L at time tj_i. For notational convenience, 
we let D Q = A\, C = B, and t = 0. 

The event vl/^ '"' 3 G , = 1 implies that in the i-th time interval [£j_i,£j] we have 

1. S % moves from A4 at time to Dj_i at time 

2. at time the walk S j+1 is at Cf_i. 

3. at time ti, the walk S ,J+1 and the walk S l are both at Q. 

By standard results regarding high dimensional random walks (e.g. see Theorem IA.9I ). the probability that 
the first event happens is at most 

3 /± Y^,- m ,-D,^ 
and the probability that both the second and the third events happen is at most



(ti-U^) 3 \2irJ ^{ 2(ti-ti-i 

The error term in Theorem A.9 is swallowed by the larger leading constants 3 in Equations (26) and (27).

As $S^i$'s walk before time $t_{i-1}$ is independent of the walks of $S^i$ and $S^{j+1}$ between $t_{i-1}$ and $t_i$, the
probability that the three subevents above happen can be bounded by taking the product of Equations (26) and
(27) above.
Let 

d\ d f — 3[||A-i — Ci\\\ + || — i — Ci||l] 



fi = t ~ — exp <^ 111 ; n,z ' " , 1 K for 1< i < j, (28) 

3 /d\ d/2 f-3||A- A-ilIll 

9i =wA^) exp \ ^ } for2 ^^ (29) 

We also let 51 = 1 and g = l~[i<j 5i- We nave 

Pv[3t 1 <...<t j ,C 1 ,...,C j :^;%. = l]< E E E (h9l)(f292)Afj9i) 

ti,...,tj Ci,...,Cj D\,...,Dj—i 

We now carefully bound this sum. Observe that in Equation [28] when ti — is fixed and ||A-l — 
Cj||| + ||Ci_i — Ci II 2 is sufficiently large, the quantity fi asymptotically becomes 

exp(-e(max{||A-i - QWl ||CU - Qg})). 

This motivates us to group the triples {Cj_i, Q, A-i} together, where the triples are covered by balls with 
approximately the same size under the L M norm . Specifically, we let D r be the set of triples (A, B, C) 
where A, B, C G Z 3 and max{p - B||i, \\A - C||i, \\B - C\\i} < r. Also, we say {A, B,C} G <9B r if 
{A, £?, C}gD r - D r _i. Notice by telescoping, we have B r = [j i<r dH>i. We may thus group the variables 
Ci and Di by parameterizing the radii of the balls, 

iv;/,< ...<:/.,,r, Cj^c;:'}- 1 

^ E E E Uw)<Jm)...{Jj9j) 

t\,:;tj Cl,...,Cj Dl,...,Dj-l 

= E EE- EE E ••• E E h-h---fr 9 

C*i6V 3 ri>0r 2 >0 r 3 -_i>0 {Ci,C 2 ,.Di} C 3 ,D 2 : C 3 ,-D 3 -i: ti<...<% 

eaD ri {C2,C3,D 2 } {C7,_i,CV,D 3 _i} 
GfflEV, G9D rj _ 1 

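First observe that by the triangle inequality $\|A_1 - C_1\|_1 + \|B - C_1\|_1 \ge \|A_1 - B\|_1 = x$, and for any vector
$v \in \mathbb{R}^3$,

$$\frac{1}{\sqrt{3}}\|v\|_1 \le \|v\|_2 \le \|v\|_1. \qquad (30)$$

We have

$$\|D_0 - C_1\|_2^2 + \|C_0 - C_1\|_2^2 = \|A_1 - C_1\|_2^2 + \|B - C_1\|_2^2 \ge \frac{1}{3}\big(\|A_1 - C_1\|_1^2 + \|B - C_1\|_1^2\big) \ge \frac{x^2}{6}. \qquad (31)$$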
Next, by the triangle inequality again, $\|D_i - C_{i+1}\|_1 + \|C_i - C_{i+1}\|_1 \ge \|D_i - C_i\|_1$. Meanwhile, we have

$$\max\{\|D_i - C_{i+1}\|_1, \|C_i - C_{i+1}\|_1, \|D_i - C_i\|_1\} = r_i.$$

Together with the relationship between the $L_1$ and $L_2$ norms in Equation (30), we obtain

$$\|D_i - C_{i+1}\|_2^2 + \|C_i - C_{i+1}\|_2^2 \ge r_i^2/6 \quad \text{for } 1 \le i < j.$$

Next, for i > 2 we define 



fi 



1 



(U-ti-if eXP \4(ti-ti-i) 



i-i 



and define 



9 



(tl...*i_i) 



1.5 ' 



It is clear that fi < fi for all i > 2. For notational convenience, we let f\ = f\. 
Our goal is now to bound the term 



E E E II- 

ri,...,^-! all triples tl<...<tj \i<j J 

{Ci,Ci+i,Di} 

4;) ' Z Z Z^" 1 i Z^ 2 i -Z^- 1 70- Z^ 



from 



n,.-.,rj-i all i<j: *i <^ *2 <^ 
{Ci,c l+ i,A} from s from 



from g 



Next, let us rearrange the indices and decompose the quantity into different parts (in terms of Tj defined 
below) and express rj as 



Z Z £1.5 

«i>0CiSV 3 1 



Z Z £l-5 
7M>0 C 2 ,D 1 2 

*2>*i {Ci,c 2 ,r>i} 
e3B r , 



Z Z £l-5 

r 2 >0 C 3 .D 2 3 

ta>*2 {c 2 ,c 3 ,r> 2 } 



E\ " fj-1 
£l.5 

tj — l>tj—2 {Cj_i,Cj_ 2 , 



Z Z /i 

*i>*i-i {Cj&j-x, 



T 2 



Ti 

(32) 

Let us briefly interpret the meaning of Tj: this term describes an upper bound for the following two groups 
of events: 

• the collisions between S^ +1 and 5*, <S* +1 , and at time tj, U+i, tj respectively. 
• the fact that at time t v the walk S i,+1 is at D v for all i < i' < j (i.e., Sf^ 1 = D v ).
which is conditioned on knowing the values for 

3f = {£1, ...,ij_i,ri, ...,7-i_i,Ci,...,Ci_i,Di, A-2}- 

When Cj_i is known, this information imposes a constraint over the way to enumerate Cj and Dj_i because 
we require {Cj_i, Cj, A-i} £ £®rj_i for a specific rj_i. Therefore, the computation of Yj depends on the 
value of Cj_i. A second constraint imposed from knowing Q is that we need tj_i < tj < ... < tj. Tj does 
not depend on other values in S. In what follows, we write Tj as a function of Cj_i and tj_i. 

Specifically, let us define the function Yj in a forward recursive manner (the summations of rj and tj 
are over integers): 



Sr-i_i>0 



{Cj-i,Cj,Dj-i} 



3-1 

{Ci_i,Ci,Di_i} 



if i = j (base case) 
if 1 < i < j 



(33) 



The variable Ti is the quantity we desire to bound. Let Atj 
to = 0). Let us start with bounding 



if i = 1. 

ti — U-i for all z (and we shall let 



Tj 



E E E M exp 



-rf_i 
4Atj 



We shall first find the total number of $\{C_j, D_{j-1}\}$ pairs so that $\{C_j, C_{j-1}, D_{j-1}\} \in \partial\mathbb{B}_{r_{j-1}}$. Notice
that when $r_{j-1}$ and $C_{j-1}$ are fixed, at least one of $\|C_{j-1} - C_j\|_1$, $\|C_{j-1} - D_{j-1}\|_1$, and $\|C_j - D_{j-1}\|_1$ is
exactly $r_{j-1}$. When $\|C_{j-1} - D_{j-1}\|_1 = r_{j-1}$, the number of possible $D_{j-1}$ is $4r_{j-1}(r_{j-1} - 1) \le 4r_{j-1}^2$.
An upper bound on the number of possible $C_j$ is $4r_{j-1}^3$. Therefore, when $\|C_{j-1} - D_{j-1}\|_1 = r_{j-1}$, the
number of $\{C_j, D_{j-1}\}$ pairs is at most $16r_{j-1}^5$. We may similarly analyze the other two cases to find that
the total number of $\{C_j, D_{j-1}\}$ pairs is at most $48r_{j-1}^5$. Thus, we have



= E E E^73 ex p 



3 f-fj-i 



At? 4At 



1 / ^ _ ._ K / ~ r 



= Et\73 E 3x48r 



5 I 'j'-l 

i-1 eX P 



At,- 3 \rj-i 



3 



< V ( 2 • / 144rf_! exp ( ) dr 



= V— ^ 18432 At 

At, 3 

= 18432x 2 < Co^ 2 , 
where $c_0 = 18432$. The last equality holds because we are considering a time frame of length $x^2$ and
therefore $\Delta t_j \le x^2$. Let us explain the derivation in greater detail because similar techniques will be used
again in the rest of the analysis. Define $h(x) = x^5 \exp\left(-\frac{x^2}{4\Delta t_j}\right)$. The function $h(x)$ is a unimodal function
with a unique global maximal value. Let $x_0 = \arg\max_{x \ge 0} h(x)$. Then we have


boj +00 

j2Kx) < j>(*) + E w 

xeN X=l X=\XQ~\ 

< / h(x)dx + / h(x)da 
J l J\x \ 

00 



< 2 / h{x)dx 
Jo 



While this bound is quite rough, it suffices for our purpose; the same approach is used to bound the
summation of unimodal functions elsewhere. The third equality holds because of the following fact,

$$\int_0^\infty x^5 \exp\left(-\frac{x^2}{4\ell}\right) dx = 64\ell^3 \qquad (34)$$

for any $\ell > 0$. (This can be verified through standard software packages such as Mathematica.)
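The identity $\int_0^\infty x^5 \exp(-x^2/(4\ell))\,dx = 64\ell^3$ and the rough bound of an integer sum of $h$ by twice its integral can also be checked numerically without a computer algebra system. The sketch below uses a plain trapezoid rule; the truncation point $40\sqrt{\ell}$ and the function names are assumptions chosen so that the neglected tail is far below floating-point precision.

```python
import math


def h(x, ell):
    """The unimodal integrand x^5 * exp(-x^2 / (4*ell))."""
    return x ** 5 * math.exp(-x * x / (4.0 * ell))


def integral_h(ell, n=20000):
    """Trapezoid rule on [0, 40*sqrt(ell)]; the tail beyond that point is negligible."""
    upper = 40.0 * math.sqrt(ell)
    dx = upper / n
    total = 0.5 * (h(0.0, ell) + h(upper, ell))
    total += sum(h(k * dx, ell) for k in range(1, n))
    return total * dx


def sum_h(ell):
    """Sum of h over the positive integers, truncated where the terms are negligible."""
    top = int(40.0 * math.sqrt(ell)) + 1
    return sum(h(float(x), ell) for x in range(1, top + 1))


for ell in (0.5, 2.0, 10.0):
    print(ell, integral_h(ell), 64.0 * ell ** 3, sum_h(ell))
```

With $\ell = \Delta t_j$, twice the integral of $144\,r^5\exp(-r^2/(4\Delta t_j))$ equals $2 \cdot 144 \cdot 64\,\Delta t_j^3 = 18432\,\Delta t_j^3$, which is where the constant 18432 comes from.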
We can prove the following hypothesis for T^: 

for all 1 <£ < j - 2 : T^^-i, < ^'Co +1 ■ ^ • \ • tf-tv 

We shall show this by induction (with the base case, in which t = 0, being proven above). 



3 / -rj-e-2 



y ' ^ „ *-± /At- \ 4At,-_ < _i } t ^+e/2 



Atj-e-^tj-e-i-tj-e-i \ J0 

= xx^m- 1 e ^ 



r .5+e/2 

Atj-e-! T j-£-l 



T j-£-2 



The last inequality holds because 



1 1'°° 1 1 

E TL5+£72 - 2 / ^T^d^^+^TmV?' 
This completes the induction. Finally, we have



Next, let \\B - Ci||i = r . By Equation[30l ||5 - Ci[|| > rg/3. By EquationEB we have ||B 
Ci HI + Pi - d||| > x 2 /6. Therefore, we have \\B - C x \\l + pi - Ci[|§ > i(rg + x 2 /2). We have 



E ^exp ^ j 



* E E-p^J- ex P V24tl 



3r§\ / -3x 2 



<- E^( S f ^-p(^)dn,)--p(^) 



< 



ti H 



60 



r(j72 + 1) 



( x 2/ 8 )j/2+l • 

The third inequality holds because 



The last inequality holds because 

roc „2 qc— 1 



for any constant c and real number x. 
We thus conclude that 



T < fin r(j + i) 72+1 30(8^^-^(1 + 1) 

1 - (j - 2)! x 2 «/2+i) ° " (j - 2)\xi 
When the permutation is considered, we have



< ^30(8^^(1 + 1) 



< 



< 



< 



(j - 2)lxi 

30j(j-i)(8V2P'cr 1 r(| + i) 




By setting $\zeta = 8\sqrt{2}\,c_0 < 210000$, our lemma follows. □

Proof of Corollary 12. 51 Notice first that if S^ +1 (B) and S l (Ai) meet at a time step to, then there exists a 
point A' { with — = 1 such that the walk S l (AQ that mimics the moves of 5* at each step collides 
with S j+1 at time t . 

Therefore, a necessary condition for S^ +1 to meet the rest of agents is that there exist S l (A^), S 2 (A' 2 ), 
S J ,, such that 

• WA'i - Ai\\i = 1 for all i < j. 

• S l mimics the moves of S i at all steps for all i < j. 

• S j+1 collides with all of S 1 ' , &' before time t. 



For any A\, A'p the collision probability is at most ( ) by Lemma [Z4l The total number of possible 



j 



j-tuples A' x , A'- is V . By using a union bound, the probability there exists a j-tuple such that all j walks 
collide with 5 ,J+1 is at most V (^^fj ■ The corollary follows. □ 

We next move to prove Lemma 12761 Since we need to frequently compare bounded random walks with 
their unbounded counterparts, we use S to represent unbounded walks and S to represent bounded walks in 
the rest of this section. 

Our analysis consists of two steps. We first tackle a simpler problem, in which we need to understand 
the probability for a random walk starting from a point near the boundary to visit another point in V 3 within 
a short time frame. We then utilize results from this scenario to prove Lemma 1231 

Lemma B.l. Let V 3 = {— n, ...,n} 3 . Let A and B be two points in V 3 such that A — B = x and the 
distance under norm between A and any boundary is at least 20||x||i. Consider a random walk S(A) 
that starts at A. Let e\ be the event that S{A) is at B at time t. Let e 2 be the event that S(A) hits a boundary 
at or before t. When t = <d(\\x\\ 2 ), we have Pr[e^ A ~>e 2 ] > Cop(t, x)for some constant cq. 

Proof. First, let us couple the random walk S(A) with a standard unbounded random walk S(A) in the 
natural way. Let ej be the event that S(A) is at B at time t and let e 2 be the event that S(A) ever visits a 
boundary at or before time t. When — ief occurs, S(A) and S(A) coincide and Pr[e| A — tef ] = Pr[e| A -ief\. 
On the other hand, we have 

p(t,x) = Pr[e 1 t Ae?]+Pr[ejA^]. 
Notice that in the event e\ A e 2 , S(A) has to travel from the boundary to B within a time interval shorter
than t. The distance between the boundary and B is at least 19||x||i. Together with the analytic form of 
p(-,-) in Lemma |A~!9l we have Pr[e* A ef = 1] < max||^|| 1>19 || ;? || 1 p(t, y). Therefore, Pr[e^ A — ief ] > 
p(t, x) - maxj^i^igi^j^ p(t, y). Finally, we have 

Pr[eJ A ^e 2 ] = Pr[eJ A -.ef] > p(t,f) - max p(t,y) > -zp(t,x). 

|M|i>19||x||i 2 

We may use the analytic form of the function p{-,-) (Lemma |A.9b for t = 0([|af||f) to verify the last 
inequality. □ 

Proof of Lemma l26l Let X be the number of collisions between S 1 and S 2 that are before time t and before 
either of them visits a boundary. Also let e(St) be the event that the random walk S ever visits a boundary 
at or before time t. We have 

e[x] = Yl p^^^aw^v^ 2 )] 

t<M\i 

= Y J2 Fr K S t =C)A^e{S})]Pr[{S? = C)A^e{S?)} (two walks are independent) 

t<\\x\\lCdV 

> Yl Pr[(S' t 1 = C) A -■e(S' t 1 )] Pi*[(S 2 = C) A -ie(Sf )] (only focus on a subset of V 3 ) 

t<\\x\\i C:||C-A||i<||5||i 

Since \\C — A\\\ < \\x\\\ and \\A — B\\\ < \\x\\i, we have ||C — < 2||x||i. 
By Lemma |B~T1 

Pr[(^ = C) A > ±p(t, A-C) and Pr[(5 f 2 = C) A -e^ 2 )] > ±p(t, B - C) 

We now have 

£ £ Pr[(^ = C) A -ng(S*)] Pr[(S 2 = C) A ^e(S 2 )] 

K||5||fC:||C-A||i<||s|| a 

^ E Z) ^p(M-C)^M-C) (by LemmaEB 

l<*<[|^|[? C:|[C— -4.|U<li»lli 

= fil V llxll 3 min {p(t,C - A)p(t,B-C)} 

The last equality can be shown by using the analytic form of p(-, •) again (Lemma [A.9I > and the fact that 
\\A - C\\ 2 and \\B - C\\ 2 are in 0(||x|| 2 ). 

Next, let us compute E[X|X > 1], i.e., the expected number of collisions when they collide at least 
once. Upon the first time S 1 and S 2 collide (before either of them visit the boundary), we couple S l and 
S 2 with two unbounded random walks S 1 and S 2 in the natural way respectively. The expected number of 
collisions between S 1 and S 2 for t steps (when they start at the same point) is an upper bound on E[X |X > 
1]. On the other hand, we may couple S 1 and S 2 with a single random walk S in the way described in 
Lemma lBTTI so that the expected number of collisions between S 1 and S 2 is the expected number of times S
returns to the point where it starts at. 

Finally, the expected number of return for an unbounded random walk is a constant (which can be 
derived from ^2 t >oP(t,0), where p(-,-)'s analytic form is in Theorem IA.9I ). Therefore, E[X | X > 1] = 
O(l). Now since"E[X] = E[X | X > 1] Pi[X > 1]. Therefore, Pr^n^] = Pv[X > 1] = fi(l/[|sc||i). 

□ 

C Missing proofs for upper and lower bound analysis 



Proof of Lemma 1X6] We first show the good density property holds with high probability. For any specific 
time t < n 2 5 , all the agents are uniformly distributed due to stationarity. For an arbitrary P £ V 3 , and 
i < (^2/^1) log -3 re, define Y(t,P,i) as the number of agents that are in dBi(P) at time t. Notice that 
E[Y(t,P,i)} = \dBi(P)\m/(2n + l) 3 andm^P) > 6E[Y(t, P, i)]. By Chernoff bounds (e.g., the second 
part of Theorem fATT}, 



A log 



-3 



Pv[Y(t,P,i) > rrii] < 2~'" H < exp(-0.65m;) < exp ^0.65 " 1 — -mlog 5 nJ < exp (- 0.65 log 2 n) 
for sufficiently large n. Next, by a union bound, 

Pr[A = 0] < J2Pr[Y(t,P,i) > rm] < (n 2 ' 5 (2n + l) 3 (log" 3 n)l 2 /h) exp(-0.65 log 2 n) <exp(--log 2 

t,P,i 

Therefore, we have Pi[Dt = 1] > 1 — exp(— \ log 2 n). 

To show the diffusion process has the small islands property with high probability, we mimic the proof 
of Lemma 6 in |[T3ll . Let Bj, be the event that there exists an island with parameter 7 = l\ log -1 n that has 
at least k agents. The quantity Prfi^] is upper bounded by the probability that Gt{^) contains a tree of k 
vertices of A as a subgraph. Since k k ~ 2 is the number of unrooted labeled trees on k nodes, and 7 3 /n 3 is 
an upper bound to the probability that a given agent lie within distance 7 from another given agent, we have 
that 

*™ * (:>" (Sf 1 * (t)* ■ * 1 - 2 (if - £ ■ o^- 3 "- 1 '- 

By setting k = 31ogn + 1, we have Prfl^] < exp{— 71ogn • log log n}. Finally, we apply a union bound 
across all agents and all time steps. Hence Prfl^ = 1] > 1 — n 2,5 m exp(- 7 log n log log n). 

Finally, consider the short travel distance property. For any fixed i G [m] and t\ < t 2 < n 2 5 such that 
h - h < ll log -12 re, we have Pr^S^ - S\ 2 ||oo > h log -4 n] < exp(- log 2 n) by Lemma |A31 There is 
a factor of 3 lost when we translate the metric from Loo-norm to Li-norm. The total number of possible i, 
t\, and t 2 are rrere 5 . Next we may apply a union bound across all these possible i, t\, and t 2 triples. We have 
Pr[L t = 0] < mn 5 exp(— log 2 re). The lemma follows by combing the three results together with one more 
union bound. □ 



Proof of Lemma 1431 Fix a time t and let rh be the number of agents in an arbitrary subcube of size £ 2 x 
£ 2 x t 2 . We have E[m] > l 2 log 2 n/27 > log 2 re. Therefore, by Chernoff bounds (Theorem lA.lt . Pr[?fi £ 
[|E[m], §E[m]] > 1-2 exp(— log 2 n/12). Now the total number of possible subcubes is at most (2re+ l) 3 
and the total number of time steps is n 2 5 . By a union bound, we have 

Pr[L> = 0] < (2n + l) 3 • re 2 ' 5 • 2 exp( — - log 2 re) < exp( — - log 2 re) 

12 15 

for sufficiently large n. □ 
D Near optimal bounds for the isoperimetric problem for closed hypercubes

This section studies an isoperimetric problem we need for our upper bound analysis. In what follows,
we let $\mathfrak{C}^d = \{1, 2, \dots, b\}^d$, where $b$ is an arbitrary integer.

Our first lemma builds up a matching between the subcubes in the interior surface and those in the 
exterior surface. This result allows us to focus on one type of surface for the purpose of understanding the 
completion time for the diffusion process. 

Lemma D.l. Let Q be an arbitrary subset of £ d . Define dQ and dQ as the interior and exterior surfaces of 
Q (i.e. the set of points in Q that neighbor with Q c and the set of points in Q c that neighbor with Q resp.; u 
and v are neighbors if \\u — v\\i = 1). Define a bipartite graph with nodes denoting dQ and dQ, in which 
an edge (u, v), u G dQ, v G dQ exists whenever u and v are neighbors. Then there exists a matching M in 
this graph with \M\ > \dQ\/(4d - 1). 

Proof. We prove this statement by explicitly constructing the matching $M$. Let $E$ be the edge set of the bipartite
graph and let $L$ and $R$ be its two sides. First notice that the degree of
each node is in the range $[1, 2d]$. We build $M$ iteratively. Each time, we pick an edge $(u, v) \in E$ and place
the edge into $M$. We then remove nodes $u, v$ from $L$ and $R$ respectively as well as all edges incident to
them. Since the degrees of $u, v$ are bounded by $2d$, we will remove at most $4d - 1$ edges from $E$. We
continue this process until no edge is left. Clearly, the edges we place into $M$ form a matching. Because
there are at least $|\partial Q|$ edges by the lower bound on the degrees, we conclude that $|M| \ge \frac{|\partial Q|}{4d - 1}$. □
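The iterative construction in this proof is a greedy maximal matching, and it can be run directly on small instances. The sketch below builds the bipartite graph between the interior and exterior surfaces of a subset $Q$ of $\{1, \dots, b\}^d$, matches greedily, and checks $|M| \ge |\partial Q|/(4d - 1)$ with $\partial Q$ taken to be the interior surface. All function names, and the particular order in which edges are picked, are assumptions made for illustration.

```python
from itertools import product


def neighbors(u, b):
    """Lattice neighbors of u (L1 distance 1) that stay inside {1..b}^d."""
    for k in range(len(u)):
        for s in (-1, 1):
            v = u[:k] + (u[k] + s,) + u[k + 1:]
            if 1 <= v[k] <= b:
                yield v


def interior_surface(Q, b):
    """Points of Q that have at least one neighbor outside Q."""
    return {u for u in Q if any(v not in Q for v in neighbors(u, b))}


def greedy_matching(Q, b):
    """Greedy maximal matching between interior-surface and exterior-surface points."""
    used, matching = set(), []
    for u in sorted(interior_surface(Q, b)):
        for v in neighbors(u, b):
            if v not in Q and v not in used:
                used.add(v)
                matching.append((u, v))
                break
    return matching


b, d = 6, 2
Q = set(product(range(2, 5), repeat=d))          # a 3 x 3 block inside the 6 x 6 grid
M = greedy_matching(Q, b)
print(len(M), len(interior_surface(Q, b)), len(M) * (4 * d - 1) >= len(interior_surface(Q, b)))
```

The greedy order does not matter for the guarantee: any maximal matching deletes at most $4d - 1$ edges per matched pair, which is all the proof uses.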

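Theorem D.2. Let $Q$ be an arbitrary subset of $\mathfrak{C}^d$. There exists a pair of constants $\alpha(d) > 1/2$ and
$\beta(d) > 0$, such that:

$$\text{if } |Q| \le \alpha(d) \cdot |\mathfrak{C}^d| = \alpha(d) \cdot b^d, \text{ then } |\partial Q| \ge \beta(d)|Q|^{(d-1)/d}.$$

Specifically, $\beta(3) \ge 0.36$.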

The isoperimetric problem over <t d was studied in H, in which the optimal structure of Q that min- 
imizes | dQ | is presented. Here, we provide another asymptotically optimal proof based on a recursive 
argument. This proof could be of independent interest. 

To begin, let us prove the special case d = 2. The analysis for this case demonstrates important ideas 
that are needed for showing the case for general d. 

Lemma D.3. Let Q be an arbitrary subset of £?. If\Q\ < |6 2 , we have 



Proof. Let V = \Q\ and X(i) be the collection of lattice points in £ 2 whose x coordinates are i. Also we 
refer V(i) := X(i) n Q as the ith stripe of Q. Define 



if\Q\ < a(d) ■ |Z' 



d 



a(d) ■ b d , 



then \dQ\ > p{d)\Q\ 



(d-l)/d 



dQ\ > \\Q 
5 



1/2 



i* = arg max \V(i)\ and i* = argmin |V(z)|. 




< \V{i)\ < \V(i*)\ < b. 
On the other hand, when < \V(i)\ < b, there is at least one element of X(i) that is also in dQ. Since the

v 

\V(i*)\- 



cardinality of Q is V, the number of non-empty stripes in Q is at least \ v Y-,\\ ■ Hence we have 



Case 2. \V(i*)\ > \J^f - By an averaging argument, |F(i*)| < V/b. Using the fact that V < §6 2 , we have 



\V(i*)\ < ^Jjv. 

Next we show that dQ > \V(i*)\ — |V(i*)|- Consider an arbitrary j such that (i*,j) G V(i*) and 
(i*,j) ^ V(i*). Since (i*,j) G Q and (i*,j) ^ <5, there exists a lattice point on the "line segment" 
{(hj) '■ i G •••,«*}} that is in <9C?. 

Finally, we have 

dg>\v(i*)\-\v(Q\ > li?U>k 



2 V 3 ; ~ 5 

□ 

We use induction to prove Theorem D.2. Our idea for proving the case of general $d$ is similar to the case $d = 2$.
First, we let $X(i)$ be the collection of lattice points in $\mathcal{L}^d$ whose first coordinates are $i$, and $V(i) = X(i) \cap Q$.
Next, we also define $i^* = \arg\max_i |V(i)|$ and $i_* = \arg\min_i |V(i)|$. Then, we mimic the analysis for
the case $d = 2$ and discuss two possible cases: when $|V(i^*)|$ is small and when $|V(i^*)|$ is large. When
$|V(i^*)|$ is small, we need to invoke results for the lower-dimensional cases. When $|V(i^*)|$ is large, we shall show that
$|V(i^*)| - |V(i_*)|$ is a lower bound on the size of $\partial Q$, which is sufficient for proving the theorem.

Let us proceed with the following lemma, which is the main vehicle for analyzing the case where $|V(i^*)|$ is
large.

Lemma D.4. Let $Q$ be an arbitrary subset of $\mathcal{L}^d$. We have

$$|\partial Q| \ge |V(i^*)| - |V(i_*)|.$$

Proof. First, define the set $A$ as

$$A = \{(i_2, i_3, \ldots, i_d) \in \mathcal{L}^{d-1} : ((i^*, i_2, i_3, \ldots, i_d) \in V(i^*)) \wedge ((i_*, i_2, i_3, \ldots, i_d) \notin V(i_*))\}.$$

Notice that by the definitions of $V(i^*)$ and $V(i_*)$, we have $|A| \ge |V(i^*)| - |V(i_*)|$. Next, we show that for
any $(i_2, \ldots, i_d) \in A$, there exists an $i_1$ such that $(i_1, \ldots, i_d) \in \partial Q$, which immediately implies the lemma.

Fix a $(d-1)$-tuple $(i_2, \ldots, i_d) \in A$. Observe that $(i^*, i_2, \ldots, i_d) \in V(i^*) \subseteq Q$ and $(i_*, i_2, \ldots, i_d) \notin V(i_*)$,
and thus $(i_*, i_2, \ldots, i_d) \notin Q$. Let us walk from the point $(i^*, i_2, \ldots, i_d)$ to the point $(i_*, i_2, \ldots, i_d)$.
Because we start with an interior point of $Q$ and end at a point outside $Q$, we leave the polytope $Q$ at least
once. Hence, there exists an $i_1$ such that $(i_1, i_2, \ldots, i_d) \in \partial Q$. □
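As a sanity check of Lemma D.4, the following Python sketch tests the inequality by brute force on random subsets of a small lattice. It rests on two assumptions that are not spelled out in this appendix: the lattice is taken to be $\{0, \ldots, b-1\}^d$, and $\partial Q$ is taken to be the set of points of $Q$ that have at least one lattice neighbour outside $Q$. The function names are hypothetical.

```python
import itertools
import random

def boundary(Q, b, d):
    """Points of Q with a neighbour in {0..b-1}^d that is not in Q (assumed definition)."""
    out = set()
    for p in Q:
        for axis in range(d):
            for step in (-1, 1):
                q = list(p)
                q[axis] += step
                q = tuple(q)
                if 0 <= q[axis] < b and q not in Q:
                    out.add(p)
    return out

def stripe_sizes(Q, b):
    """|V(i)| for every value i of the first coordinate."""
    sizes = [0] * b
    for p in Q:
        sizes[p[0]] += 1
    return sizes

def check_lemma(b, d, trials, seed=0):
    """Return True if |boundary| >= max stripe - min stripe on every random trial."""
    rng = random.Random(seed)
    lattice = list(itertools.product(range(b), repeat=d))
    for _ in range(trials):
        k = rng.randint(0, len(lattice))
        Q = set(rng.sample(lattice, k))
        sizes = stripe_sizes(Q, b)
        if len(boundary(Q, b, d)) < max(sizes) - min(sizes):
            return False
    return True
```

Under the assumed definition, the walk in the proof always crosses from a point of $Q$ to a neighbour outside $Q$, so `check_lemma` should never find a counterexample; it is a test of the statement, not a substitute for the proof.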

Now we are ready to prove the main theorem. 

Proof of Theorem D.2. We prove by induction on $d$. Specifically, we show that for any $d$ and any $Q(d) \subseteq \mathcal{L}^d$,
there exists a pair of constants (that depend only on $d$) $\alpha(d) > 1/2$ and $\beta(d) > 0$ such that

if $|Q| < \alpha(d)|\mathcal{L}^d|$, then $|\partial Q(d)| \ge \beta(d)|Q|^{(d-1)/d}$.

The base case was considered in Lemma D.3. Now let us assume the theorem holds up to the $d$-dimensional
space. We now prove the $(d+1)$-dimensional case.

Our $\alpha(d+1)$ and $\beta(d+1)$ are set up in the following way:

$$\alpha(d+1) = \alpha(d)/2 + 1/4$$

$$\beta(d+1) = \min\left\{ \frac{\alpha(d)}{(\alpha(d+1))^{d/(d+1)}} - (\alpha(d+1))^{1/(d+1)},\ \frac{\beta(d)\,(\alpha(d+1))^{1/(d+1)}}{(\alpha(d))^{1/d}} \right\}$$

Let $T = \frac{\alpha(d)}{(\alpha(d+1))^{d/(d+1)}}\, V^{d/(d+1)}$, where $V = |Q|$, and consider the following two cases.

Case 1. $|V(i^*)| \le T$. Our $\alpha(d+1)$ is set up in a way that when $V < \alpha(d+1)b^{d+1}$, $T < \alpha(d)b^d$. Next, we
invoke the result for the $d$-dimensional case on each $V(i)$, $i \in [b]$. Notice that a lattice point on the exterior surface of
$V(i)$ in the space $\mathcal{L}^d$ is also on the exterior surface of $Q$. Let us denote the size of the exterior surface of $V(i)$
by $|\partial V(i)|$. By the induction hypothesis, we have $|\partial V(i)| \ge \beta(d)|V(i)|^{(d-1)/d}$. Note also that $\sum_{i \le b} |V(i)| = V$.
Next, define $f(x) = x^{(d-1)/d}$, which is a concave function. We have

$$\begin{aligned}
|\partial Q| &\ge \sum_{i \le b} |\partial V(i)| \\
&\ge \sum_{i \le b} \beta(d)\, f(|V(i)|) && \text{(induction hypothesis)} \\
&\ge \sum_{i \le b} \beta(d)\, \frac{|V(i)|}{T}\, f(T) && \text{($|V(i)| \le |V(i^*)| \le T$ and the concavity of $f(\cdot)$)} \\
&= \beta(d)\, V\, \frac{f(T)}{T} \\
&= \frac{\beta(d)\,(\alpha(d+1))^{1/(d+1)}}{(\alpha(d))^{1/d}}\, V^{d/(d+1)} && \text{(using the definition of $T$)} \\
&\ge \beta(d+1)\, V^{d/(d+1)} && \text{(by the construction of $\beta(d+1)$)}
\end{aligned}$$

Case 2. $|V(i^*)| > T$. By Lemma D.4, $|\partial Q| \ge |V(i^*)| - |V(i_*)|$. Also by an averaging argument we
have $|V(i_*)| \le V/b$. The theorem then follows. □
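The recursion $\alpha(d+1) = \alpha(d)/2 + 1/4$ can be unrolled exactly to see why every $\alpha(d)$ stays above $1/2$. The sketch below uses exact fractions; the base value $\alpha(2) = 2/3$ is an assumption made for illustration (any base value above $1/2$ behaves the same way), and the helper name `alpha_sequence` is hypothetical.

```python
from fractions import Fraction

def alpha_sequence(alpha2, d_max):
    """Iterate alpha(d+1) = alpha(d)/2 + 1/4 from d = 2 up to d_max."""
    alphas = {2: Fraction(alpha2)}
    for d in range(2, d_max):
        alphas[d + 1] = alphas[d] / 2 + Fraction(1, 4)
    return alphas

# Assumed base case alpha(2) = 2/3 (illustrative only).
alphas = alpha_sequence(Fraction(2, 3), 6)

# Gap to the fixed point 1/2 halves at every step:
# alpha(d) - 1/2 = (alpha(2) - 1/2) / 2**(d - 2).
gaps = {d: a - Fraction(1, 2) for d, a in alphas.items()}
```

Because the gap $\alpha(d) - 1/2$ halves at each step and never reaches zero, $\alpha(d) > 1/2$ for all $d$, and $\alpha(d) - \alpha(d+1) = (\alpha(d) - 1/2)/2$ is strictly positive.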

E Existing techniques 

This section briefly reviews existing lower bound and upper bound analysis techniques and explains 
the difficulties in generalizing them to the three dimensional case. 

E.1 Lower bound

Two existing approaches that can potentially be adopted to our lower bound analysis are: 

1. Geometrically understand the growth rate of the smallest ball that covers all the infected agents (hereafter,
the smallest covering ball). An upper bound on the ball's growth rate translates into a lower
bound on the completion time for diffusion. Examples of this approach include [2, 10].

2. Analyze the interaction of the agents locally to conclude that the influence of infection is constrained
to a small region around the initially infected agent, over a small time increment. A union bound or
recursive argument is then applied to give a global result. This approach is exemplified by [13].
Let us start with the first approach. Alves et al. and Kesten et al. [2, 10] assume the density of the
agents is a constant; recall that the density of the agents is the ratio between the total number of agents and
the volume of the space. Their model has infinite space, and hence there is no size parameter $n$. With this
assumption, they obtain that the radius of the smallest covering ball grows linearly in time almost surely.
Translating to our setting, an $o(1)$ density of agents would lead to a growth rate that is also linear in time
$t$ but scales in some way with the density. Directly applying a linear growth rate would still give a valid
lower bound of order $\Omega(n)$ on the diffusion time, but this is substantially worse than the bound we need.
One potential way to improve their argument is to analyze the scaling of the growth rate with respect to the
density. While this approach may well be feasible, it is by no means immediate. For example, the analysis of
[?] appears to depend on the fact that two nearby agents have constant probability of meeting within a small
number of steps, which leads to the conclusion that uninfected agents near the smallest covering ball are
quickly infected. This crucially requires that the density of agents is constant, and relaxing this assumption
to $o(1)$ density appears non-trivial.

We have chosen instead to follow the technique developed by Pettarin et al. [13], extending it via our
diffusion tree argument. We now argue that this extension appears necessary. Recall the island graph at time
$t$ defined in Definition 3.4. Pettarin et al.'s approach can be summarized by the following three steps:

1. At any time step, the island graph $G_t(\gamma)$ is constructed, where $\gamma$ is an appropriately selected parameter.

2. Specify $\delta t$ such that within a time increment of $\delta t$, w.h.p. a piece of virus is unable to travel from one
island to another.

3. Argue that the information has to travel across $n/\gamma$ islands sequentially to complete the diffusion, so
that a lower bound of $\frac{n}{\gamma} \cdot \delta t$ is established. The parameter $n/\gamma$ is asymptotically optimal because the
space $V^3$ cannot pack more than $n/\gamma$ islands along any direction (including those that are not parallel
to the axes).

Now let us discuss the internal constraints on the parameters under this framework that prevent us
from optimizing the lower bound for the 3-dimensional case.

At step 1, we need to decide $\gamma$. When $\gamma$ is set to be larger than $n \cdot m^{-1/3}$, i.e. the critical percolation point
[13], $G_t(\gamma)$ becomes connected w.h.p. and the subsequent arguments break down. Therefore, $\gamma < n \cdot m^{-1/3}$.

At step 2, for illustration let us focus only on two islands $\mathrm{Isd}_1$ and $\mathrm{Isd}_2$, and let $a_1 \in \mathrm{Isd}_1$ and $a_2 \in \mathrm{Isd}_2$
be two arbitrary agents, one from each island. We now need to decide on the value of $\delta t$. We face
two options:

1. If $\delta t$ is set to be smaller than $\gamma^2$, then w.h.p. $a_1$ and $a_2$ do not meet in time $\delta t$ [13].

2. If $\delta t$ is larger than $\gamma^2$, then with probability $\Theta(1/\gamma)$, $a_1$ and $a_2$ will meet in time $\delta t$ (Lemma 2.3).

We consider both options to examine the quality of the lower bounds we can get, using step 3 above. For
the first option, the lower bound we get is $n\gamma < n^2 \cdot m^{-1/3}$, which is suboptimal. For instance, when
$m = n^{1.5}$, the lower bound is $n^{1.5}$ as opposed to $\Omega(n^{1.75})$. For the second option, regardless of the choice
of $\delta t$, the lower bound always fails to hold with probability $\Theta(1/\gamma) = \Omega(m^{1/3}/n)$, and so step 2 cannot be
satisfied with high probability.
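The exponent arithmetic behind the first option can be checked mechanically. The sketch below writes $m = n^{\mu}$, takes $\gamma$ at its largest admissible order $n \cdot m^{-1/3}$ and $\delta t$ at order $\gamma^2$, and returns the exponent of $n$ in the resulting bound $\frac{n}{\gamma} \cdot \delta t = n\gamma$. The function name `option1_exponent` is hypothetical, and the target exponent $7/4$ is simply the $\Omega(n^{1.75})$ value quoted above.

```python
from fractions import Fraction

def option1_exponent(mu):
    """Exponent e such that n * gamma = n**e when m = n**mu and gamma = n * m**(-1/3)."""
    gamma_exp = 1 - Fraction(mu) / 3   # gamma = n^(1 - mu/3)
    return 1 + gamma_exp               # (n / gamma) * gamma^2 = n * gamma

e = option1_exponent(Fraction(3, 2))   # the case m = n^1.5
target = Fraction(7, 4)                # exponent of the bound quoted in the text
```

For $\mu = 3/2$ this gives the exponent $3/2$, which falls short of $7/4$ by a factor of $n^{1/4}$.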

Our analysis corresponds to setting $\delta t$ large, but doing a more careful analysis of the local infected
region by considering a branching process that represents a historical trace of the infection. Our island
diffusion rule is correspondingly modified from the rule of [13] to control the growth rate of this branching
process.
E.2 Upper bound

We also explain why existing upper bound techniques such as those from [6, 13] do not appear to generalize
immediately to the three dimensional case. The analyses in [6, 13], which are based on percolation,
follow a proof strategy that contains two steps:

1. Let $a_1$ be the initially infected agent. Identify a ball $B$ (under the $L_\infty$ norm) of radius $r$ that covers $a_1$'s
initial position so that after $t_1$ time steps, where $t_1$ is a parameter to be decided, a constant portion
of the agents in $B$ become infected (i.e. the fraction of infected agents to total number of agents in $B$ is
$\Theta(1)$). Moreover, these infected agents are well clustered, i.e. at distance $O(r)$ from the ball $B$.

2. Show that if a ball $B'$ has a constant portion of infected agents at time $t$, then at $t + t_2$, all adjacent
balls with the same radius will also have a constant portion of infected agents. Here, $t_2$ is a parameter
to be decided. Moreover, these newly infected agents are well clustered, i.e. at distance $O(r)$ from the
balls.

One usually also needs a good density condition, i.e. agent density in any $r$-ball is $\Theta(m(r/n)^d)$. By
repeatedly applying the second step, one can establish an upper bound on the time at which all balls in $V^3$
have a constant portion of infected agents. Once this happens, usually it becomes straightforward to find the
diffusion time. The asymptotic upper bound will be $\frac{n}{r} \cdot t_2 + t_1$.

Let us explain this in more detail for the case $d = 2$. Assume the good density condition. First, we
need to set $t_2 = \Theta(r^2)$ so that the newly infected agents at step 2 are well clustered. This ensures that
the infected agents do not scatter uncontrollably far from the ball and jeopardize our next
recursion. We now sketch a bound on $r$. Consider step 2. Suppose the number of infected agents in $B'$ at $t$
is $m(r/n)^2 \times \Theta(1)$. By our choice $t_2 = \Theta(r^2)$, each infected agent in $B'$ has probability $\Theta(1)$ to meet each
agent in the adjacent ball (by using Lemma 1 in [13]). Therefore, the expected number of infections in the
adjacent ball is given by

$$\underbrace{m(r/n)^2 \times \Theta(1)}_{\text{\# of infected agents in } B'} \times \underbrace{m(r/n)^2}_{\text{\# of uninfected agents in an adjacent ball}} \times \underbrace{\Theta(1)}_{\text{infection prob.}}$$

which, by the requirement of step 2, should be equal to $m(r/n)^2 \times \Theta(1)$. This gives $r = \Theta(\sqrt{n^2/m})$. Note
that this also leads to the condition that the number of infected agents in $B'$ at $t$ and the adjacent ball at $t + t_2$
are both $\Theta(1)$.

Now set $t_1 = \Theta(r^2)$ and so the number of infected agents in $B$ at time $t_1$ is $m(r/n)^2 \times \Theta(1) = \Theta(1)$.
Note that both steps 1 and 2 are now satisfied. By recursively applying the second step, we can see that by
time $\frac{n}{r} \cdot t_2 + t_1 = \Theta(n^2/\sqrt{m})$ all the balls in $V^2$ will have $m(r/n)^2 \times \Theta(1)$ infected agents. Hence in
the same order of time period $\Theta(n^2/\sqrt{m})$, all the agents in $V^2$ will be infected. This time period gives the
optimal upper bound of the diffusion time for $d = 2$.

We now argue that this strategy does not work for $d = 3$. Let us attempt to mimic the above argument
step by step. Again set $t_2 = \Theta(r^2)$ so that the infected agents are well clustered. Next, note that in contrast
to the two-dimensional case, Lemma 2.3 states that the meeting probability of two random walks in $V^3$ with
initial distance $r$ apart within time $\Theta(r^2)$ is $\Theta(1/r)$. Hence, in light of step 2, we require

$$\underbrace{m(r/n)^3 \times \Theta(1)}_{\text{\# of infected agents in } B'} \times \underbrace{m(r/n)^3}_{\text{\# of uninfected agents in an adjacent ball}} \times \underbrace{\Theta(1/r)}_{\text{infection prob.}} = \underbrace{m(r/n)^3 \times \Theta(1)}_{\text{desired \# of infections}}$$

which gives $r = \Theta(\sqrt{n^3/m})$. Note that the number of infected agents in $B'$ at $t$ and that of the adjacent
balls at $t + t_2$ in step 2 are now both $m(r/n)^3 \times \Theta(1) = \Theta(\sqrt{n^3/m}) = \Theta(r)$.
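The two balance conditions can be compared numerically. The sketch below drops all constant factors and solves $m(r/n)^d \cdot r^{-p} = 1$ for $r$, where $p = 0$ encodes the $\Theta(1)$ meeting probability used for $d = 2$ and $p = 1$ encodes the $\Theta(1/r)$ meeting probability used for $d = 3$. The function names and the sample values of $n$ and $m$ are hypothetical and chosen only for illustration.

```python
import math

def balanced_radius(n, m, d, p):
    """Solve m * (r/n)**d * r**(-p) = 1 for r, ignoring constant factors."""
    return (n ** d / m) ** (1.0 / (d - p))

def infected_per_ball(n, m, d, r):
    """Order of the number of infected agents in a ball of radius r."""
    return m * (r / n) ** d

n, m = 10_000.0, 1_000_000.0             # illustrative values only

r2 = balanced_radius(n, m, d=2, p=0)     # r = sqrt(n^2 / m)
r3 = balanced_radius(n, m, d=3, p=1)     # r = sqrt(n^3 / m)

count2 = infected_per_ball(n, m, 2, r2)  # stays of constant order
count3 = infected_per_ball(n, m, 3, r3)  # grows like r
time2 = (n / r2) * r2 ** 2               # (n/r) * t_2 with t_2 of order r^2
```

With these values the two-dimensional ball holds an order-one number of infected agents, while the three-dimensional ball must hold on the order of $r$ of them, which is the source of the difficulty discussed next.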

We now try to set an appropriate value for $t_1$. First, note that step 1 requires the number of infected
agents in $B$ at time $t_1$ to be $\Theta(r)$. Then the question is to find the approximate time for one initially infected
agent to infect $\Theta(r)$ agents that are from $B$. Moreover, we need that these infected agents do not travel at
distance outside $\Omega(r)$ in the same time period.

To give a bound for this $t_1$, let us look into the method of [13]. Note that in the case of $d = 2$, the
number of agents in $B$ at any time is $\Theta(1)$. In this case, [13] suggests chopping the time $t_1$ into intervals each
of length $\Theta(r^2)$. During each of these intervals, one only focuses on a pair of agents from $B$ and sees if they
meet each other; this method aims to reduce the analysis of correlation among multiple agents' meetings, a
complicated quantity, to a sequence of independent problems that involve only the meeting of two random
walks. Since there are only $\Theta(1)$ such pair combinations, and each such meeting probability is $\Theta(1)$, a
$t_1 = \Theta(r^2)$ is enough to guarantee that the number of infected agents is $\Theta(1)$. Also these infected agents
are well clustered at $B$. Thus the argument works well for $d = 2$.

However, such an argument breaks down for $d = 3$ because now we are required to have $\Theta(r)$ infected
agents at $t_1$, and the meeting probability between any two agents is $\Theta(1/r)$. As a result the following
tradeoffs cannot be balanced: 1) $t_1$ is set to be $\Theta(r^2)$ so that the infected agents are well clustered, but the
number of infected agents at $t_1$ will only be $\Theta(1)$; 2) $t_1$ is set to be $\omega(r^2)$, but then the infected agents are
not well clustered and may not constitute $\Theta(r)$ infected agents within $B$ at $t_1$. The first tradeoff appears if
one uses the chopping argument of [13]: divide $t_1$ into intervals of length $\Theta(r^2)$. For each interval, observe
the number of meetings between any infected and uninfected agents. This gives an expected total number
of infections at $t_1$ of $r \cdot \Theta(1/r) = \Theta(1)$, which is less than the required number $\Theta(r)$. Secondly, setting
$t_1 = \omega(r^2)$ boosts the number of infected agents, but also increases the chance that an infected agent
escapes from the vicinity of $B$. An accurate analysis of these two effects seems highly non-trivial and does
not follow from the existing results of [13].

Finally, we mention the work of Clementi et al. [6], which deals with issues similar to the above. At each step,
conditioned on the positions of the infected agents, the infection events of the uninfected agents become
independent of each other. The change in the infected population over time can then be analyzed. However,
such analysis is possible in [6] because the agents in their model can jump a distance $\Omega(\sqrt{n})$ at each step.
This leads to much less serial dependence for each agent and consequently requires less effort in keeping
track of each agent's position. These phenomena, unfortunately, do not apply to our setting.

F An example of the diffusion tree 



[Figure 1 panels: snapshots at $t = 0, 20, 40, 60$. Each panel shows the agents, the islands $\mathrm{Isd}_t(\cdot, \ell_1 \log^{-1} n)$ involved at that time step, and the diffusion tree, with direct-child edges marked.]



Figure 1: An example of the diffusion process and its corresponding diffusion tree at t = 0,20,40,60. 
Assume no collisions happen beyond these 4 time steps. 

At $t = 0$, the agent $a_1$ is initially infected. Since $\mathrm{Isd}_0(a_1, \ell_1 \log^{-1} n) = \{a_1, a_2, a_3, a_4, a_5\}$, the
agents $a_2$, $a_3$, $a_4$, and $a_5$ are all considered infected at $t = 0$. Also, $r$ does not have a direct child.

At $t = 20$, the agent $a_1$ meets $a_8$. Since $\mathrm{Isd}_{20}(a_8, \ell_1 \log^{-1} n) = \{a_8, a_7, a_{10}\}$, $a_7$ and $a_{10}$ are
also infected. At this time step, $\mathrm{dchild}(a_1) = \{a_8\}$ and $\mathrm{child}(a_1) = \{a_8, a_7, a_{10}\}$.

At $t = 40$, $a_4$ meets $a_{11}$ and $a_6$ meets $a_7$; $\mathrm{Isd}_{40}(a_4, \ell_1 \log^{-1} n) = \{a_4, a_{11}\}$ and $\mathrm{Isd}_{40}(a_7, \ell_1 \log^{-1} n) =
\{a_6, a_7\}$. At this time step, $\mathrm{child}(a_4) = \mathrm{dchild}(a_4) = \{a_{11}\}$ and $\mathrm{child}(a_7) = \mathrm{dchild}(a_7) = \{a_6\}$. Notice
that $a_{11} \in F_1$ and $a_6 \in F_3$. These two generations grow simultaneously at $t = 40$.

At $t = 60$, $a_1$ meets $a_9$. $\mathrm{Isd}_{60}(a_9) = \{a_1, a_9\}$. We have $a_9 \in \mathrm{child}(a_1)$ and $a_9 \in \mathrm{dchild}(a_1)$.
Also, notice that $a_1$ now contains two direct children.
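The bookkeeping in this example can be mirrored by a small data structure. The sketch below is a reading of the example rather than the paper's formal definition: it assumes that dchild records the agent actually met and that child records every agent newly infected through the island at that meeting. The event list, the agent labels and the function name `build_tree` are reconstructed for illustration and should be treated as hypothetical.

```python
from collections import defaultdict

# (time, infecting agent, agent it meets, island reported at that time step)
events = [
    (20, "a1", "a8", {"a8", "a7", "a10"}),
    (40, "a4", "a11", {"a4", "a11"}),
    (40, "a7", "a6", {"a6", "a7"}),
    (60, "a1", "a9", {"a1", "a9"}),
]
initial_island = {"a1", "a2", "a3", "a4", "a5"}   # infected at t = 0

def build_tree(initial_island, events):
    """Replay the meetings in time order and record direct and island children."""
    infected = set(initial_island)
    dchild = defaultdict(set)   # agents met directly
    child = defaultdict(set)    # all agents newly infected through the meeting
    for _, parent, met, island in sorted(events, key=lambda ev: ev[0]):
        if parent not in infected:
            raise ValueError(f"{parent} is not infected yet")
        newly = island - infected
        dchild[parent].add(met)
        child[parent] |= newly
        infected |= newly
    return dchild, child, infected

dchild, child, infected = build_tree(initial_island, events)
```

Replaying the four events reproduces the sets listed above, including the two direct children of $a_1$ after $t = 60$.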